The momentous revolution in science precipitated by Isaac Newton’s calculus soon revealed
the central role of partial differential equations throughout mathematics and its
manifold  applications.  Notable  examples  of  fundamental  physical  phenomena  modeled 
by  partial  differential  equations,  most  of  which  are  named  after  their  discoverers  or  early 
proponents, include quantum mechanics (Schrödinger, Dirac), relativity (Einstein),
electromagnetism (Maxwell), optics (eikonal, Maxwell-Bloch, nonlinear Schrödinger),
fluid mechanics (Euler, Navier-Stokes, Korteweg-de Vries, Kadomtsev-Petviashvili),
superconductivity (Ginzburg-Landau), plasmas (Vlasov), magnetohydrodynamics
(Navier-Stokes + Maxwell), elasticity (Lamé, von Kármán), thermodynamics (heat),
chemical reactions (Kolmogorov-Petrovsky-Piskounov), finance (Black-Scholes),
neuroscience (FitzHugh-Nagumo), and many, many more. The challenge is that, while their
derivation as physical models (classical, quantum, and relativistic) is, for the most part,
well established [57, 69], most of the resulting partial differential equations are notoriously difficult to solve,
and  only  a  small  handful  can  be  deemed  to  be  completely  understood.  In  many  cases,  the 
only means of calculating and understanding their solutions is through the design of
sophisticated numerical approximation schemes, an important and active subject in its own
right.  However,  one  cannot  make  serious  progress  on  their  numerical  aspects  without  a 
deep  understanding  of  the  underlying  analytical  properties,  and  thus  the  analytical  and 
numerical  approaches  to  the  subject  are  inextricably  intertwined. 

This  textbook  is  designed  for  a  one-year  course  covering  the  fundamentals  of  partial 
differential  equations,  geared  towards  advanced  undergraduates  and  beginning  graduate 
students  in  mathematics,  science,  and  engineering.  No  previous  experience  with  the  subject 
is  assumed,  while  the  mathematical  prerequisites  for  embarking  on  this  course  of  study 
will  be  listed  below.  For  many  years,  I  have  been  teaching  such  a  course  to  students 
from  mathematics,  physics,  engineering,  statistics,  chemistry,  and,  more  recently,  biology, 
finance,  economics,  and  elsewhere.  Over  time,  I  realized  that  there  is  a  genuine  need  for 
a  well-written,  systematic,  modern  introduction  to  the  basic  theory,  solution  techniques, 
qualitative  properties,  and  numerical  approximation  schemes  for  the  principal  varieties  of 
partial  differential  equations  that  one  encounters  in  both  mathematics  and  applications.  It 
is  my  hope  that  this  book  will  fill  this  need,  and  thus  help  to  educate  and  inspire  the  next 
generation  of  students,  researchers,  and  practitioners. 

While  the  classical  topics  of  separation  of  variables,  Fourier  analysis,  Green’s  functions, 
and  special  functions  continue  to  form  the  core  of  an  introductory  course,  the  inclusion 
of nonlinear equations, shock wave dynamics, dispersion, symmetry and similarity methods,
the Maximum Principle, Huygens’ Principle, quantum mechanics and the Schrödinger
equation, and mathematical finance makes this book more in tune with recent developments
and trends. Numerical approximation schemes should also play an essential role in an
introductory course, and this text covers the two most basic approaches: finite differences
and  finite  elements. 

On the other hand, modeling and the derivation of equations from physical phenomena
and principles, while not entirely absent, have been downplayed, not because they are
unimportant, but because time constraints limit what one can reasonably cover in an academic
year’s  course.  My  own  belief  is  that  the  primary  purpose  of  a  course  in  partial  differential 
equations  is  to  learn  the  principal  solution  techniques  and  to  understand  the  underlying 
mathematical analysis. Thus, time devoted to modeling effectively lessens what can be
adequately covered in the remainder of the course. For this reason, modeling is better left to
a  separate  course  that  covers  a  wider  range  of  mathematics,  albeit  at  a  more  cursory  level. 
(Modeling  texts  worth  consulting  include  [57,69].)  Nevertheless,  this  book  continually 
makes  contact  with  the  physical  applications  that  spawn  the  partial  differential  equations 
under  consideration,  and  appeals  to  physical  intuition  and  familiar  phenomena  to  motivate, 
predict,  and  understand  their  mathematical  properties,  solutions,  and  applications.  Nor 
do I attempt to cover stochastic differential equations (see [83] for this increasingly
important area), although I do work through one important by-product: the Black-Scholes
equation, which underlies the modern financial industry. I have tried throughout to
balance rigor and intuition, thus giving the instructor flexibility with their relative emphasis
and  time  to  devote  to  solution  techniques  versus  theoretical  developments. 

The  course  material  has  now  been  developed,  tested,  and  revised  over  the  past  six  years 
here  at  the  University  of  Minnesota,  and  has  also  been  used  by  several  other  universities  in 
both  the  United  States  and  abroad.  It  consists  of  twelve  chapters  along  with  two  appendices 
that  review  basic  complex  numbers  and  some  essential  linear  algebra.  See  below  for  further 
details  on  chapter  contents  and  dependencies,  and  suggestions  for  possible  semester  and 
year-long  courses  that  can  be  taught  from  the  book. 


Prerequisites 

The  initial  prerequisite  is  a  reasonable  level  of  mathematical  sophistication,  which  includes 
the  ability  to  assimilate  abstract  constructions  and  apply  them  in  concrete  situations. 
Some physical insight and familiarity with basic mechanics, continuum physics,
elementary thermodynamics, and, occasionally, quantum mechanics are also very helpful, but not
essential.

Since  partial  differential  equations  involve  the  partial  derivatives  of  functions,  the  most 
fundamental  prerequisite  is  calculus  —  both  univariate  and  multivariate.  Fluency  in  the 
basics  of  differentiation,  integration,  and  vector  analysis  is  absolutely  essential.  Thus,  the 
student  should  be  at  ease  with  limits,  including  one-sided  limits,  continuity,  differentiation, 
integration,  and  the  Fundamental  Theorem.  Key  techniques  include  the  chain  rule,  product 
rule,  and  quotient  rule  for  differentiation,  integration  by  parts,  and  change  of  variables  in 
integrals.  In  addition,  I  assume  some  basic  understanding  of  the  convergence  of  sequences 
and  series,  including  the  standard  tests  —  ratio,  root,  integral  —  along  with  Taylor’s 
theorem  and  elementary  properties  of  power  series.  (On  the  other  hand,  Fourier  series  will 
be  developed  from  scratch.) 

When dealing with several space dimensions, some familiarity with the key constructions
and results from two- and three-dimensional vector calculus is helpful: rectangular
(Cartesian),  polar,  cylindrical,  and  spherical  coordinates;  dot  and  cross  products;  partial 
derivatives;  the  multivariate  chain  rule;  gradient,  divergence,  and  curl;  parametrized  curves 
and  surfaces;  double  and  triple  integrals;  line  and  surface  integrals,  culminating  in  Green’s 
Theorem  and  the  Divergence  Theorem  —  as  well  as  very  basic  point  set  topology:  notions  of 
open, closed, bounded, and compact subsets of Euclidean space; the boundary of a domain
and  its  normal  direction;  etc.  However,  all  the  required  concepts  and  results  will  be  quickly 
reviewed  in  the  text  at  the  appropriate  juncture:  Section  6.3  covers  the  two-dimensional 
material,  while  Section  12.1  deals  with  the  three-dimensional  counterpart. 


Many  solution  techniques  for  partial  differential  equations,  e.g.,  separation  of  variables 
and symmetry methods, rely on reducing them to one or more ordinary differential equations.
In order to make progress, the student should therefore already know how to find
the  general  solution  to  first-order  linear  equations,  both  homogeneous  and  inhomogeneous, 
along  with  separable  nonlinear  first-order  equations,  linear  constant-coefficient  equations, 
particularly  those  of  second  order,  and  first-order  linear  systems  with  constant-coefficient 
matrices,  in  particular  the  role  of  eigenvalues  and  the  construction  of  a  basis  of  solutions. 
The  student  should  also  be  familiar  with  initial  value  problems,  including  statements  of 
the basic existence and uniqueness theorems, but not necessarily their proofs. Basic
references include [18, 20, 23], while more advanced topics can be found in [52, 54, 59]. On
the  other  hand,  while  boundary  value  problems  for  ordinary  differential  equations  play  a 
central  role  in  the  analysis  of  partial  differential  equations,  the  book  does  not  assume  any 
prior  experience,  and  will  develop  solution  techniques  from  the  beginning. 


Students  should  also  be  familiar  with  the  basics  of  complex  numbers,  including  real 
and  imaginary  parts;  modulus  and  phase  (or  argument);  and  complex  exponentials  and 
Euler’s  formula.  These  are  reviewed  in  Appendix  A.  In  the  numerical  chapters,  some 
familiarity with basic computer arithmetic, i.e., floating-point and round-off errors, is
assumed. Also, on occasion, basic numerical root finding algorithms, e.g., Newton’s Method;
numerical linear algebra, e.g., Gaussian Elimination and basic iterative methods; and
numerical solution schemes for ordinary differential equations, e.g., Runge-Kutta Methods,
are  mentioned.  Students  who  have  forgotten  the  details  can  consult  a  basic  numerical 
analysis  textbook,  e.g.,  [24,  60],  or  reference  volume,  e.g.,  [94]. 

Finally,  knowledge  of  the  basic  results  and  conceptual  framework  provided  by  modern 
linear  algebra  will  be  essential  throughout  the  text.  Students  should  already  be  on  familiar 
terms with the fundamental concepts of vector space, both finite- and infinite-dimensional;
linear independence, span, and basis; inner products, orthogonality, norms, and the
Cauchy-Schwarz and triangle inequalities; eigenvalues and eigenvectors; determinants; and linear
systems. These are all covered in Appendix B; a more comprehensive and recommended
reference  is  my  previous  textbook,  [89],  coauthored  with  my  wife,  Cheri  Shakiban,  which 
provides  a  firm  grounding  in  the  key  ideas,  results,  and  methods  of  modern  applied  linear 
algebra.  Indeed,  Chapter  9  here  can  be  viewed  as  the  next  stage  in  the  general  linear 
algebraic  framework  that  has  proven  to  be  so  indispensable  for  the  modern  analysis  and 
numerics  of  not  just  linear  partial  differential  equations  but,  indeed,  all  of  contemporary 
pure  and  applied  mathematics. 


While  applications  and  solution  techniques  are  paramount,  the  text  does  not  shy  away 
from  precise  statements  of  theorems  and  their  proofs,  especially  when  these  help  shed 
light  on  the  applications  and  development  of  the  subject.  On  the  other  hand,  the  more 
advanced  results  that  require  analytical  sophistication  beyond  what  can  be  reasonably 
assumed  at  this  level  are  deferred  to  a  subsequent,  graduate-level  course.  In  particular, 
the  book  does  not  assume  that  the  student  has  taken  a  course  in  real  analysis,  and  hence, 
while  the  basic  ideas  underlying  Hilbert  space  are  explained  in  the  context  of  Fourier 
analysis,  knowledge  of  measure  theory  and  Lebesgue  integration  is  neither  assumed  nor 
used.  Consequently,  the  precise  definitions  of  Hilbert  space  and  generalized  functions 
(distributions)  are  necessarily  left  somewhat  vague,  with  the  level  of  detail  being  similar 
to that found in a basic physics course on quantum mechanics. Indeed, one of the goals of
the  course  is  to  inspire  mathematics  students  (and  others)  to  take  a  rigorous  real  analysis 
course,  because  it  is  so  indispensable  to  the  more  advanced  theory  and  applications  of 
partial  differential  equations  that  build  on  the  material  presented  here. 

Outline  of  Chapters 

The  first  chapter  is  brief  and  serves  to  set  the  stage,  introducing  some  basic  notation 
and  describing  what  is  meant  by  a  partial  differential  equation  and  a  (classical)  solution 
thereof.  It  then  describes  the  basic  structure  and  properties  of  linear  problems  in  a  general 
sense, appealing to the underlying framework of linear algebra that is summarized in
Appendix B. In particular, the fundamental superposition principles for both homogeneous
and  inhomogeneous  linear  equations  and  systems  are  employed  throughout. 

The first three sections of Chapter 2 are devoted to first-order partial differential equations
in two variables (time and a single space coordinate), starting with simple linear
cases.  Constant-coefficient  equations  are  easily  solved,  leading  to  the  important  concepts 
of characteristic and traveling wave. The method of characteristics is then extended,
initially to linear first-order equations with variable coefficients, and then to the nonlinear
case,  where  most  solutions  break  down  into  discontinuous  shock  waves,  whose  subsequent 
dynamics  relies  on  the  underlying  physics.  The  material  on  shocks  may  be  at  a  slightly 
higher  level  of  difficulty  than  the  instructor  wishes  to  deal  with  this  early  in  the  course, 
and  hence  may  be  downplayed  or  even  omitted,  perhaps  returned  to  at  a  later  stage,  e.g., 
when  studying  Burgers’  equation  in  Section  8.4,  or  when  the  concept  of  weak  solution 
is  introduced  in  Chapter  10.  The  final  section  of  Chapter  2  is  essential,  and  shows  how 
the  second-order  wave  equation  can  be  reduced  to  a  pair  of  first-order  partial  differential 
equations,  thereby  producing  the  celebrated  solution  formula  of  d’Alembert. 
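
For orientation, d’Alembert’s formula for the wave equation u_tt = c² u_xx on the whole line, with initial displacement f and initial velocity g, reads u(t, x) = ½ [f(x − ct) + f(x + ct)] + (1/(2c)) ∫ g(z) dz, the integral running from x − ct to x + ct. The following Python sketch evaluates it numerically. The function name `dalembert`, its parameters, and the choice of Simpson's rule for the integral are illustrative assumptions, not taken from the text.

```python
import math


def dalembert(f, g, c, t, x, n=2000):
    """Evaluate d'Alembert's formula for u_tt = c^2 u_xx on the real line.

    f : initial displacement u(0, x)
    g : initial velocity u_t(0, x)
    n : even number of Simpson subintervals (an illustrative default)
    """
    a, b = x - c * t, x + c * t            # ends of the domain of dependence
    h = (b - a) / n
    # Composite Simpson's rule for the integral of g over [a, b].
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    integral = total * h / 3
    return 0.5 * (f(a) + f(b)) + integral / (2 * c)


# Usage: with f = sin, g = cos the exact solution is
# sin(x) cos(ct) + cos(x) sin(ct) / c, so the two values should agree closely.
u = dalembert(math.sin, math.cos, 2.0, 0.3, 0.7)
exact = math.sin(0.7) * math.cos(0.6) + math.cos(0.7) * math.sin(0.6) / 2.0
print(u, exact)
```

The two traveling-wave terms f(x − ct) and f(x + ct) are precisely what the reduction to a pair of first-order transport equations produces.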

Chapter  3  covers  the  essentials  of  Fourier  series,  which  is  the  most  important  tool  in 
our  analytical  arsenal.  After  motivating  the  subject  by  adapting  the  eigenvalue  method  for 
solving  linear  systems  of  ordinary  differential  equations  to  the  heat  equation,  the  remainder 
of  the  chapter  develops  basic  Fourier  series  analysis,  in  both  real  and  complex  forms.  The 
final  section  investigates  the  various  modes  of  convergence  of  Fourier  series:  pointwise, 
uniform,  in  norm.  Along  the  way,  Hilbert  space  and  completeness  are  introduced,  at 
an  appropriate  level  of  rigor.  Although  more  theoretical  than  most  of  the  material,  this 
section  is  nevertheless  strongly  recommended,  even  for  applications-oriented  students,  and 
can  serve  as  a  launching  pad  for  higher-level  analysis. 

Chapter  4  immediately  delves  into  the  application  of  Fourier  techniques  to  construct 
solutions to the three paradigmatic second-order partial differential equations in two
independent variables (the heat, wave, and Laplace/Poisson equations) via the method
of  separation  of  variables.  For  dynamical  problems,  the  separation  of  variables  approach 
reinforces  the  importance  of  eigenfunctions.  In  the  case  of  the  Laplace  equation,  separation 
is  performed  in  both  rectangular  and  polar  coordinates,  thereby  establishing  the  averaging 
property  of  solutions  and,  consequently,  the  Maximum  Principle  as  important  by-products. 
The  chapter  concludes  with  a  short  discussion  of  the  classification  of  second-order  partial 
differential  equations,  in  two  independent  variables,  into  parabolic,  hyperbolic,  and  elliptic 
categories,  emphasizing  their  disparate  natures  and  the  role  of  characteristics. 

Chapter  5  is  the  first  devoted  to  numerical  approximation  techniques  for  partial 
differential  equations.  Here  the  emphasis  is  on  finite  difference  methods.  All  of  the 
preceding cases are discussed: heat equation, transport equations, wave equation, and
Laplace/Poisson equation. The student learns that, in contrast to the field of ordinary
differential  equations,  numerical  methods  must  be  specially  adapted  to  the  particularities 
of  the  partial  differential  equation  under  investigation,  and  may  well  not  converge  unless 
certain  stability  constraints  are  satisfied. 
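
As a minimal illustration of such a stability constraint, consider the standard explicit (forward-time, centered-space) finite difference scheme for the heat equation u_t = u_xx with zero boundary values. With μ = Δt/Δx², the scheme is well behaved when μ ≤ 1/2, while for larger μ the highest-frequency grid modes can be amplified at every step. The Python sketch below is an assumption-laden toy, not the book's own code: the function name `explicit_heat_max`, the grid size, the step count, and the spike initial data are all chosen purely for illustration.

```python
def explicit_heat_max(mu, n_intervals=20, n_steps=200):
    """Run the explicit scheme for u_t = u_xx and return max |u| at the end.

    mu is the ratio dt / dx**2. The initial data is a unit spike at the
    middle node; boundary values are held at zero throughout.
    """
    u = [0.0] * (n_intervals + 1)
    u[n_intervals // 2] = 1.0
    for _ in range(n_steps):
        new = u[:]                          # end values stay at zero
        for j in range(1, n_intervals):
            new[j] = u[j] + mu * (u[j + 1] - 2.0 * u[j] + u[j - 1])
        u = new
    return max(abs(v) for v in u)


# Usage: compare a ratio below the threshold 1/2 with one above it.
print("mu = 0.4:", explicit_heat_max(0.4))   # remains bounded by 1
print("mu = 0.6:", explicit_heat_max(0.6))   # grows by many orders of magnitude
```

For μ ≤ 1/2 each new value is a weighted average of old values with nonnegative weights, which is why the maximum cannot increase; for μ = 0.6 on this grid the most oscillatory mode is multiplied by a factor of magnitude greater than one at each step.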

Chapter  6  introduces  a  second  important  solution  method,  founded  on  the  notion  of  a 
Green’s  function.  Our  development  relies  on  the  use  of  distributions  (generalized  functions), 
concentrating on the extremely useful “delta function”, which is characterized both as an
unconventional  limit  of  ordinary  functions  and,  more  rigorously  but  more  abstractly,  by 
duality  in  function  space.  While,  as  with  Hilbert  space,  we  do  not  assume  familiarity 
with  the  analysis  tools  required  to  develop  the  fully  rigorous  theory  of  such  generalized 
functions,  the  aim  is  for  the  student  to  assimilate  the  basic  ideas  and  comfortably  work 
with  them  in  the  context  of  practical  examples.  With  this  in  hand,  the  Green’s  function 
approach  is  then  first  developed  in  the  context  of  boundary  value  problems  for  ordinary 
differential  equations,  followed  by  consideration  of  elliptic  boundary  value  problems  for  the 
Poisson  equation  in  the  plane. 

Chapter  7  returns  to  Fourier  analysis,  now  over  the  entire  real  line,  resulting  in  the 
Fourier  transform.  Applications  to  boundary  value  problems  are  followed  by  a  further 
development  of  Hilbert  space  and  its  role  in  modern  quantum  mechanics.  Our  discussion 
culminates  with  the  Heisenberg  Uncertainty  Principle,  which  is  viewed  as  a  mathematical 
property  of  the  Fourier  transform.  Space  and  time  considerations  persuaded  me  not  to 
press  on  to  develop  the  Laplace  transform,  which  is  a  special  case  of  the  Fourier  transform, 
although  it  can  be  profitably  employed  to  study  initial  value  problems  for  both  ordinary 
and  partial  differential  equations. 

Chapter  8  integrates  and  further  develops  several  different  themes  that  arise  in  the 
analysis  of  dynamical  evolution  equations,  both  linear  and  nonlinear.  The  first  section 
introduces  the  fundamental  solution  for  the  heat  equation,  and  describes  applications  in 
mathematical  finance  through  the  celebrated  Black-Scholes  equation.  The  second  section 
is  a  brief  discussion  of  symmetry  methods  for  partial  differential  equations,  a  favorite  topic 
of  the  author  and  the  subject  of  his  graduate-level  monograph  [87].  Section  8.3  introduces 
the  Maximum  Principle  for  the  heat  equation,  an  important  tool,  inspired  by  physics,  in 
the  advanced  analysis  of  parabolic  problems.  The  last  two  sections  study  two  basic  higher- 
order  nonlinear  equations.  Burgers’  equation  combines  dissipative  and  nonlinear  effects, 
and can be regarded as a simplified model of viscous fluid mechanics. Interestingly,
Burgers’ equation can be explicitly solved by transforming it into the linear heat equation. The
convergence of its solutions to the shock-wave solutions of the limiting nonlinear transport
equation  underlies  the  modern  analytic  method  of  viscosity  solutions.  The  final  section 
treats  basic  third-order  linear  and  nonlinear  evolution  equations  arising,  for  example,  in 
the  modeling  of  surface  waves.  The  linear  equation  serves  to  introduce  the  phenomenon  of 
dispersion, in which different Fourier modes move at different velocities, producing
common physical effects observed in, for instance, water waves. We also highlight the recently
discovered and fascinating Talbot effect of dispersive quantization and fractalization on
periodic domains. The nonlinear Korteweg-de Vries equation has many remarkable
properties, including localized soliton solutions, first discovered in the 1960s, that result from
its  status  as  a  completely  integrable  system. 
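
The transformation of Burgers’ equation into the linear heat equation mentioned above is commonly known as the Hopf-Cole transformation. As a sketch, and assuming Burgers’ equation is written with viscosity coefficient γ > 0 (the notation and normalization here are assumptions and may differ from those used in Section 8.4), it can be stated as follows:

```latex
u_t + u\,u_x = \gamma\,u_{xx},
\qquad
u = -2\gamma\,\frac{\varphi_x}{\varphi}
  = \frac{\partial}{\partial x}\bigl(-2\gamma \log \varphi\bigr),
\qquad \text{where} \qquad
\varphi_t = \gamma\,\varphi_{xx}, \quad \varphi > 0 .
```

That is, every positive solution φ of the heat equation yields a solution u of Burgers’ equation, which can be checked by substituting ψ = −2γ log φ, verifying that ψ_t + ½ ψ_x² = γ ψ_xx, and differentiating with respect to x.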

Before proceeding further, Chapter 9 takes time to formulate a general abstract framework
that underlies much of the more advanced analysis of linear partial differential equations.
The material is at a slightly higher level of abstraction (although amply illustrated
by concrete examples), so the more computationally oriented reader may wish to skip
ahead to the last two chapters, referring back to the relevant concepts and general
results in particular contexts as needed. Nevertheless, I strongly recommend covering at
least  some  of  this  chapter,  both  because  the  framework  is  important  to  understanding  the 
commonalities among various concrete instantiations, and because it demonstrates the
pervasive power of mathematical analysis, even for those whose ultimate goal is applications.
The  development  commences  with  the  adjoint  of  a  linear  operator  between  inner  product 
spaces (a powerful and far-ranging generalization of the matrix transpose), which
naturally leads to consideration of self-adjoint and positive definite operators, all illustrated
by  finite-dimensional  linear  algebraic  systems  and  boundary  value  problems  governed  by 
ordinary  and  partial  differential  equations.  A  particularly  important  construction,  forming 
the  foundation  of  the  finite  element  numerical  method,  is  the  characterization  of  solutions 
to  positive  definite  boundary  value  problems  via  minimization  principles.  Next,  general 
results concerning eigenvalues and eigenfunctions of self-adjoint and positive definite
operators are established, which serve to explain the key features of reality, orthogonality,
and  completeness  that  underlie  Fourier  and  more  general  eigenfunction  series  expansions. 
A  general  characterization  of  complete  eigenfunction  systems  based  on  properties  of  the 
Green’s  function  nicely  ties  together  two  of  the  principal  themes  of  the  text. 

Chapter 10 returns to the numerical analysis of partial differential equations,
introducing the powerful finite element method. After outlining the general construction based
on  the  preceding  abstract  minimization  principle,  we  present  its  practical  implementation, 
first for one-dimensional boundary value problems governed by ordinary differential equations
and then for elliptic boundary value problems governed by the Laplace and Poisson
equations in the plane. The final section develops an alternative approach, based on the
idea of a weak solution to a partial differential equation, a concept of independent
interest. Indeed, the nonclassical shock-wave solutions encountered in Section 2.3 are properly
characterized  as  weak  solutions. 

The final two chapters, 11 and 12, survey the analysis of partial differential equations
in,  respectively,  two  and  three  space  dimensions,  concentrating,  as  before,  on  the  Laplace, 
heat,  and  wave  equations.  Much  of  the  analysis  relies  on  separation  of  variables,  which,  in 
curvilinear  coordinates,  leads  to  new  classes  of  special  functions  that  arise  as  solutions  to 
certain  linear  second-order  non-constant-coefficient  ordinary  differential  equations.  Since 
we  are  not  assuming  familiarity  with  this  subject,  the  method  of  power  series  solutions  to 
ordinary  differential  equations  is  developed  in  some  detail.  We  also  present  the  methods 
of  Green’s  functions  and  fundamental  solutions,  including  their  qualitative  properties  and 
various  applications.  The  material  has  been  arranged  according  to  spatial  dimension  rather 
than  equation  type;  thus  Chapter  11  deals  with  the  planar  heat  and  wave  equations  (the 
planar  Laplace  and  Poisson  equations  having  been  treated  earlier,  in  Chapters  4  and  6), 
while  Chapter  12  covers  all  their  three-dimensional  counterparts.  This  arrangement  allows 
a  more  orderly  treatment  of  the  required  classes  of  special  functions;  thus,  Bessel  functions 
play  the  leading  role  in  Chapter  11,  while  spherical  harmonics,  Legendre/Ferrers  functions, 
and  Laguerre  polynomials  star  in  Chapter  12.  The  last  chapter  also  presents  the  Kirchhoff 
formula that solves the wave equation in three-dimensional space, an important
consequence being the validity of Huygens’ Principle concerning the localization of disturbances
in space, which, surprisingly, does not hold in a two-dimensional universe. The book
culminates with an analysis of the Schrödinger equation for the hydrogen atom, whose bound
states  are  the  atomic  energy  levels  underlying  the  periodic  table,  atomic  spectroscopy,  and 
molecular  chemistry. 

Course Outlines and Chapter Dependencies


With sufficient planning and a suitably prepared and engaged class, most of the material
in the text can be covered in a year. The typical single-semester course will finish with
Chapter 6. Some pedagogical suggestions:

Chapter  1:  Go  through  quickly,  the  main  take-away  being  linearity  and  superposition. 

Chapter 2: Most is worth covering and needed later, although Section 2.3, on shock waves,
is optional, or can be deferred until later in the course.

Chapter 3: Students who have already taken a basic course in Fourier analysis can move
directly ahead to the next chapter. The last section, on convergence, is
important, but could be shortened or omitted in a more applied course.

Chapter 4: The heart of the first semester’s course. Some of the material at the end of
Section 4.1 (Robin boundary conditions and the root cellar problem) is
optional, as is the very last subsection, on characteristics.

Chapter 5: A course that includes numerics (as I strongly recommend) should start with
Section 5.1 and then cover at least a couple of the following sections, the
selection depending upon the interests of the students and instructor.

Chapter 6: The material on distributions and the delta function is important for a student’s
general mathematical education, both pure and applied, and, in particular,
for their role in the design of Green’s functions. The proof of Green’s
representation formula (6.107) might be heavy going for some, and can be omitted
by just covering the preceding less-rigorous justification of the logarithmic
formula for the free-space Green’s function.


Chapter 7: Sections 7.1 and 7.2 are essential, and convolution in Section 7.3 is also
important. Section 7.4, on Hilbert space and quantum mechanics, can easily be
omitted.


Chapter 8: All five sections are more or less independent of each other and, except for the
fundamental solution and maximum principle for the heat equation, not used
subsequently. Thus, the instructor can pick and choose according to interest
and time allotted.


Chapter 9: This chapter is at a more abstract level than the bulk of the text, and can
be skipped entirely (referring back when required), although if one intends
to  cover  the  finite  element  method,  the  material  in  the  first  three  sections 
leading  to  minimization  principles  is  required.  Chapters  11  and  12  can,  if 
desired,  be  launched  into  straight  after  Chapter  8,  or  even  Chapter  7  plus 
the  material  on  the  heat  equation  in  Chapter  8. 

Chapter 10: Again, for a course that includes numerics, finite elements is extremely
important and well worth covering. The final Section 10.4, on weak solutions,
is  optional,  particularly  the  revisiting  of  shock  waves,  although  if  this  was 
skipped  in  the  early  part  of  the  course,  now  might  be  a  good  time  to  revisit 
Section  2.3. 


Chapters 11 and 12: These constitute another essential component of the classical partial
differential equations course. The detour into series solutions of ordinary
differential equations is worth following, unless this is done elsewhere in the
curriculum. I recommend trying to cover as much as possible, although one
may well run out of time before reaching the end, in which case, consider
omitting the end of Section 11.6, on Chladni figures and nodal curves,
Section 12.6, on Kirchhoff’s formula and Huygens’ Principle, and Section 12.7,
on the hydrogen atom. Of course, if Chapter 6, on Green’s functions, and
Section 8.1, on fundamental solutions, were omitted, those aspects will also
presumably be omitted here; even if they were covered, there is not a
compelling reason to revisit these topics in higher dimensions, and one may prefer
to jump ahead to the more novel material appearing in the final sections.

Exercises  and  Software 

Exercises  appear  at  the  end  of  almost  every  subsection,  and  come  in  a  variety  of  genres. 
Most  sets  start  with  some  straightforward  computational  problems  to  develop  and  reinforce 
the  principal  new  techniques  and  ideas.  Ability  to  solve  these  basic  problems  is  a  minimal 
requirement  for  successfully  assimilating  the  material.  More  advanced  exercises  appear 
later  on.  Some  are  routine,  but  others  involve  challenging  computations,  computer-based 
projects,  additional  practical  and  theoretical  developments,  etc.  Some  will  challenge  even 
the most advanced reader. A number of straightforward technical proofs, as well as
interesting and useful extensions of the material, particularly in the later chapters, have been
relegated  to  the  exercises  to  help  maintain  continuity  of  the  narrative. 

Don’t be afraid to assign only a few parts of a multi-part exercise. I have found
the True/False exercises to be particularly useful for testing a student’s level of
understanding. A full answer is not merely a T or F, but must include a detailed explanation
of the reason, e.g., a proof or a counterexample, or a reference to a result in the text.
Many  computer  projects  are  included,  particularly  in  the  numerical  chapters,  where  they 
are  essential  for  learning  the  practical  techniques.  However,  computer-based  exercises  are 
not  tied  to  any  specific  choice  of  language  or  software;  in  my  own  course,  Matlab  is  the 
preferred  programming  platform.  Some  exercises  could  be  streamlined  or  enhanced  by  the 
use  of  computer  algebra  systems,  such  as  Mathematica  and  Maple,  but,  in  general,  I 
have  avoided  assuming  access  to  any  symbolic  software. 

As  a  rough  guide,  some  of  the  exercises  are  marked  with  special  signs: 

♦ indicates an exercise that is referred to in the body of the text, or is important for
further development or applications of the subject. These include theoretical details,
omitted proofs, or new directions of importance.

♥ indicates a project, usually a longer exercise with multiple interdependent parts.

♠ indicates an exercise that requires (or at least strongly recommends) use of a computer.
The student could be asked either to write their own computer code in, say, Matlab,
Maple, or Mathematica, or to make use of pre-existing packages.

♣ = ♠ + ♥ indicates a more extensive computer project.

Movies 

In  the  course  of  writing  this  book,  I  have  made  a  number  of  movies  to  illustrate  the 
dynamical behavior of solutions and their numerical approximations. I have found that
they are an extremely effective pedagogical tool and strongly recommend showing them
in  the  classroom  with  appropriate  commentary  and  discussion.  They  are  an  ideal  medium 
for  fostering  a  student’s  deep  understanding  and  insight  into  the  phenomena  exhibited  by 
the  at  times  indigestible  analytical  formulas  —  much  better  than  the  individual  snapshots 
that  appear  in  the  figures  in  the  printed  book. 

While  it  is  clearly  impossible  to  include  the  movies  directly  in  the  printed  text,  the 
electronic  e-book  version  will  contain  direct  links.  In  addition,  I  have  posted  all  the  movies 
on  my  own  web  site,  along  with  the  Mathematica  code  used  to  generate  them: 

http://www.math.umn.edu/~olver/mov.html
When  a  movie  is  available,  the  sign  [+j  appears  in  the  figure  caption. 


Conventions  and  Notation 

A  complete  list  of  symbols  employed  can  be  found  in  the  Symbol  Index  that  appears  at 
the  end  of  the  book. 

Equations  are  numbered  consecutively  within  chapters,  so  that,  for  example,  (3.12) 
refers to the 12th equation in Chapter 3, irrespective of which section it appears in.

Theorems, lemmas, propositions, definitions, and examples are also numbered
consecutively within each chapter, using a single scheme. Thus, in Chapter 1, Definition 1.2
follows  Example  1.1,  and  precedes  Proposition  1.3  and  Theorem  1.4.  I  find  this  numbering 
system  to  be  the  most  helpful  for  speedy  navigation  through  the  book. 

References  (books,  papers,  etc.)  are  listed  alphabetically  at  the  end  of  the  text,  and 
are  referred  to  by  number.  Thus,  [89]  is  the  89th  listed  reference,  namely  my  Applied 
Linear  Algebra  text. 

Q.E.D. signifies the end of a proof; it is an acronym for “quod erat demonstrandum”, which
is Latin for “which was to be demonstrated”.

The variables that appear throughout will be subject to consistent notational
conventions. Thus $t$ always denotes time, while $x, y, z$ represent (Cartesian) space coordinates.
Polar coordinates $r, \theta$, cylindrical coordinates $r, \theta, z$, and spherical coordinates
$r, \theta, \varphi$ will also be used when needed, and our conventions appear at the appropriate
places in the exposition; be especially careful with the last case, since the angular variables
$\theta, \varphi$ are subject to two contradictory conventions in the literature. The above are
almost always independent variables in the partial differential equations under study; the
dependent variables or unknowns will mostly be denoted by $u, v, w$, while $f, g, h$ and
$F, G, H$ represent known functions, appearing as forcing terms or in boundary data. See
Chapter 4 for our convention, used in differential geometry, for denoting functions in different
coordinate systems, i.e., $u(x, y)$ versus $u(r, \theta)$.

In accordance with standard contemporary mathematical notation, the “blackboard
bold” letter $\mathbb{R}$ denotes the real number line, $\mathbb{C}$ denotes the field of complex
numbers, $\mathbb{Z}$ denotes the set of integers, both positive and negative, while $\mathbb{N}$
denotes the natural numbers, i.e., the nonnegative integers, including 0. Similarly,
$\mathbb{R}^n$ and $\mathbb{C}^n$ denote the corresponding $n$-dimensional real and complex vector
spaces consisting of $n$-tuples of elements of $\mathbb{R}$ and $\mathbb{C}$, respectively. The zero
vector in each is denoted by $\mathbf{0}$.

Boldface lowercase letters, e.g., $\mathbf{v}, \mathbf{x}, \mathbf{a}$, usually denote vectors
(almost always column vectors), whose entries are indicated by subscripts: $v_i, x_i$, etc.
Matrices are denoted by ordinary capital letters, e.g., $A, C, K, M$, but not all such letters
refer to matrices; for instance, $V$ often refers to a vector space, while $F$ is typically a
forcing function. The entries of a matrix, say $A$, are indicated by the corresponding
subscripted lowercase letters: $a_{ij}$, with $i$ the row index and $j$ the column index.


Angles are always measured in radians, although occasionally degrees will be
mentioned in descriptive sentences. All trigonometric functions are evaluated on radian angles.
Following the conventions advocated in [85,86], we use $\operatorname{ph} z$ to denote the phase of a
complex number $z \in \mathbb{C}$, which is more commonly called the argument and denoted by
$\arg z$. Among the many reasons to prefer “phase” are to avoid potential confusion with
the argument $x$ of a function $f(x)$, as well as to be in accordance with the “Method of
Stationary Phase” mentioned in Chapter 8.

We use $\{\, f \mid C \,\}$ to denote a set, where $f$ gives the formula for the members of the
set and $C$ is a (possibly empty) list of conditions. For example, $\{\, x \mid 0 \le x \le 1 \,\}$
means the closed unit interval from 0 to 1, also written $[0,1]$, while
$\{\, a x^2 + b x + c \mid a, b, c \in \mathbb{R} \,\}$ is the set of real quadratic polynomials, and
$\{0\}$ is the set consisting only of the number 0. We use $x \in S$ to indicate that $x$ is an
element of the set $S$, while $y \notin S$ says that $y$ is not an element. Set theoretic union and
intersection are denoted by $S \cup T$ and $S \cap T$, respectively. The subset sign
$S \subset U$ includes the possibility that the sets $S$ and $U$ might be equal, although for
emphasis we sometimes write $S \subseteq U$. On the other hand, $S \subsetneq U$ specifically
implies that the two sets are not equal. We use
$U \setminus S = \{\, x \mid x \in U,\ x \notin S \,\}$ to denote the set-theoretic difference, meaning
all elements of $U$ that do not belong to $S$. We use the abbreviations max and min to denote
the maximum and minimum elements of a set of real numbers, or of a real-valued function.
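The set notation above maps closely onto Python's built-in sets, which can be handy when checking small examples by hand. The following is only a sketch: the finite `universe` and the particular sets `S`, `T`, `U` are assumptions chosen so that the members can be listed, and they do not appear in the text.

```python
# Set-builder notation { f | C } mirrored by Python set comprehensions.
# The finite universe is an assumption made only so the sets can be listed.
universe = range(-5, 6)

S = {x for x in universe if 0 <= x <= 3}       # { x | 0 <= x <= 3 }
T = {x * x for x in universe if abs(x) <= 2}   # { x^2 | |x| <= 2 }
U = set(universe)

union = S | T          # set-theoretic union of S and T
intersection = S & T   # set-theoretic intersection of S and T
difference = U - S     # all elements of U that do not belong to S

print(sorted(union), sorted(intersection), sorted(difference))
print(S <= U, S < U, 7 not in S)   # subset, proper subset, non-membership
print(max(S), min(T))              # maximum and minimum elements
```

Here `S <= U` allows equality of the two sets, while `S < U` is the strict version, matching the distinction between the two subset signs described above.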


The symbol $\equiv$ is used to emphasize when two functions are identically equal, so
$f(x) \equiv 1$ means that $f$ is the constant function, equal to 1 at all values of $x$. It is also
occasionally used in modular arithmetic, whereby $i \equiv j \bmod n$ means $i - j$ is divisible
by $n$. The symbol $:=$ will define a quantity, e.g., $f(x) := x^2 - 1$. An arrow is used in two
senses: first, to indicate convergence of a sequence, e.g., $x_n \to x^\star$ as $n \to \infty$, or,
alternatively, to indicate a function, so $f\colon X \to Y$ means that the function $f$ maps the
domain set $X$ to the image or target set $Y$, with formula $y = f(x)$. Composition of functions
is denoted by $f \circ g$, while $f^{-1}$ indicates the inverse function. Similarly, $A^{-1}$ denotes
the inverse of a matrix $A$.


By an elementary function we mean a combination of rational, algebraic,
trigonometric, exponential, logarithmic, and hyperbolic functions. Familiarity with their basic
properties is assumed. We always use $\log x$ for the natural (base $e$) logarithm, avoiding
the ugly modern notation $\ln x$. On the other hand, the required properties of the various
special functions (the error and complementary error functions, the gamma function, Airy
functions, Bessel and spherical Bessel functions, Legendre and Ferrers functions, Laguerre
functions, spherical harmonics, etc.) will be developed as needed.

Summation notation is used throughout, so $\sum_{i=1}^{n} a_i$ denotes the finite sum
$a_1 + a_2 + \cdots + a_n$ or, if the upper limit is $n = \infty$, an infinite series. Of course, the
lower limit need not be 1; if it is $-\infty$ and the upper limit is $+\infty$, the result is a doubly
infinite series, e.g., the complex Fourier series in Chapter 3. We use $\lim_{n \to \infty} a_n$ to
denote the usual limit of a sequence $a_n$. Similarly, $\lim_{x \to a} f(x)$ denotes the limit of the
function $f(x)$ at a point $a$, while $f(a^-) = \lim_{x \to a^-} f(x)$ and
$f(a^+) = \lim_{x \to a^+} f(x)$ are the one-sided (left- and right-hand, respectively) limits,
which agree if and only if $\lim_{x \to a} f(x)$ exists.

We will employ a variety of standard notations for derivatives. In the case of ordinary
derivatives, the most basic is the Leibniz notation $\frac{du}{dx}$ for the derivative of $u$ with
respect to $x$. As for partial derivatives, both the full Leibniz notation
$\frac{\partial u}{\partial t}, \frac{\partial u}{\partial x}, \frac{\partial^2 u}{\partial x^2}, \frac{\partial^3 u}{\partial t\,\partial x^2}$,
and the more compact subscript notation $u_t, u_x, u_{xx}, u_{txx}$, etc. will be interchangeably
employed throughout; see also Chapter 1. Unless specifically mentioned, all functions are
assumed to be sufficiently smooth that any indicated derivatives exist and the relevant mixed
partial derivatives are equal. Ordinary derivatives can also be indicated by the Newtonian
notation $u'$ instead of $\frac{du}{dx}$ and $u''$ for $\frac{d^2 u}{dx^2}$, while $u^{(n)}$ denotes the
$n$th order derivative $\frac{d^n u}{dx^n}$. If the variable is time, $t$, instead of space, $x$, then
we may employ dots, $\dot{u}, \ddot{u}$, instead of primes.

Definite integrals are denoted by $\int_a^b f(x)\,dx$, while $\int f(x)\,dx$ is the corresponding
indefinite integral or anti-derivative. We assume familiarity only with the Riemann theory
of integration, although students who have learned Lebesgue integration may wish to take
advantage of that on occasion, e.g., during the discussion of Hilbert space.


Historical  Matters 


Mathematics is both a historical and a social activity, and many notable algorithms,
theorems, and formulas are named after famous (and, on occasion, not-so-famous)
mathematicians, scientists, and engineers, who are usually, but not necessarily, the discoverer(s). The
text includes a succinct description of many of the named contributors. Readers who are
interested in more extensive historical details, complete biographies, and, when available,
portraits or photos, are urged to consult the informative University of St. Andrews
MacTutor web site:

http://www-history.mcs.st-andrews.ac.uk/history/index.html

Early  prominent  contributors  to  the  subject  include  the  Bernoulli  family,  Euler,  d’Alembert, 
Lagrange,  Laplace,  and,  particularly,  Fourier,  whose  remarkable  methods  in  part  sparked 
the  nineteenth  century’s  rigorization  of  mathematical  analysis  and  then  mathematics  in 
general, as pursued by Cauchy, Riemann, Cantor, Weierstrass, and Hilbert. In the
twentieth century, the subject of partial differential equations reached maturity, producing an
ever-increasing  number  of  research  papers,  both  theoretical  and  applied.  Nevertheless,  it 
remains  one  of  the  most  challenging  and  active  areas  of  mathematical  research,  and,  in 
some  sense,  we  have  only  scratched  the  surface  of  this  deep  and  fascinating  subject. 

Textbooks devoted to partial differential equations began to appear long ago. Of
particular note, Courant and Hilbert’s monumental two-volume treatise, [34,35], played a
central role in the development of applied mathematics in general, and partial
differential equations in particular. Indeed, it is not an exaggeration to state that all modern
treatments, including this one, as well as large swaths of research, have been directly
influenced by this magnificent text. Modern undergraduate textbooks worth consulting include
[50,91,92,114,120], which are more or less at the same mathematical level but have a
variety of points of view and selection of topics. The graduate-level texts [38,44,61,70,99]
are recommended starting points for the more advanced reader and beginning researcher.
More specialized monographs and papers will be referred to at the appropriate junctures.

This book began life in 1999 as a part of a planned comprehensive introduction to
applied math, inspired in large part by Gilbert Strang’s wonderful text, [112]. After some
time and much effort, it was realized that the original vision was much too ambitious a
goal, so my wife, Cheri Shakiban, and I recast the first part as our applied linear algebra
textbook, [89]. I later decided that a large fraction of the remainder could be reworked
into an introduction to partial differential equations, which, after some time and classroom
testing, resulted in the book you are now reading.

Some  Final  Remarks 

To the student: You are about to delve into the vast and important field of partial
differential  equations.  I  hope  you  enjoy  the  experience  and  profit  from  it  in  your  future 
studies  and  career,  wherever  they  may  take  you.  Please  send  me  your  comments.  Did  you 
find  the  explanations  helpful  or  confusing?  Were  enough  examples  included?  Were  the 
exercises  of  sufficient  variety  and  appropriate  level  to  enable  you  to  learn  the  material?  Do 
you  have  suggestions  for  improvements  to  be  incorporated  into  a  new  edition? 

To the instructor: Thank you for adopting this text! I hope you enjoy teaching from
it as much as I enjoyed writing it. Whatever your experience, I want to hear from you. Let
me know which parts you liked and which you didn’t, which sections worked and which
were less successful, which parts your students enjoyed, which parts they struggled with,
and which parts they disliked. How can it be improved?

To all readers: Like every author, I sincerely hope that I have eliminated all errors in
the text. But, more realistically, I know that no matter how many times one proofreads,
mistakes still manage to squeeze through (or, worse, be generated during the editing
process). Please email me your questions, typos, mathematical errors, comments, suggestions,
and so on. The book’s dedicated web site

http://www.math.umn.edu/~olver/pde.html

will actively maintain a comprehensive list of known corrections, commentary, feedback,
and resources, as well as links to the movies and Mathematica code mentioned above.

Chapter 1

What  Are  Partial  Differential  Equations? 


Let us begin by delineating our field of study. A differential equation is an equation that
relates the derivatives of a (scalar) function depending on one or more variables. For
example,

\[ \frac{d^4 u}{dx^4} + \frac{d^2 u}{dx^2} + u^2 = \cos x \tag{1.1} \]

is a differential equation for the function $u(x)$ depending on a single variable $x$, while

\[ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} - u \tag{1.2} \]

is a differential equation involving a function $u(t, x, y)$ of three variables.

A  differential  equation  is  called  ordinary  if  the  function  u  depends  on  only  a  single 
variable,  and  partial  if  it  depends  on  more  than  one  variable.  Usually  (but  not  quite  always) 
the  dependence  of  u  can  be  inferred  from  the  derivatives  that  appear  in  the  differential 
equation.  The  order  of  a  differential  equation  is  that  of  the  highest-order  derivative  that 
appears  in  the  equation.  Thus,  (1.1)  is  a  fourth-order  ordinary  differential  equation,  while 
(1.2)  is  a  second-order  partial  differential  equation. 
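The two definitions above (ordinary versus partial, and order) are mechanical enough to express in a few lines of code. The sketch below uses a hypothetical encoding that is not from the text: each derivative of $u$ is recorded by its string of subscripts, so `"xx"` stands for the second $x$-derivative and `""` for $u$ itself. The independent variables are passed explicitly, since, as noted above, they cannot always be inferred from the derivatives that appear.

```python
def classify(independent_vars, derivative_terms):
    """Return (kind, order) for a single differential equation.

    independent_vars: string of one-letter variable names, e.g. "txy".
    derivative_terms: subscripts of the derivatives of u that occur,
        e.g. "xx" for u_xx and "" for u itself (hypothetical encoding).
    """
    for term in derivative_terms:
        unknown = set(term) - set(independent_vars)
        if unknown:
            raise ValueError(f"not an independent variable: {sorted(unknown)}")
    # Ordinary if u depends on a single variable, partial otherwise.
    kind = "ordinary" if len(independent_vars) == 1 else "partial"
    # The order is that of the highest-order derivative that appears.
    order = max(len(term) for term in derivative_terms)
    return kind, order


# Equation (1.1): fourth and second x-derivatives of u(x), plus u itself.
eq_1_1 = classify("x", ["xxxx", "xx", ""])
# Equation (1.2): u_t, u_xx, u_yy and u, for a function u(t, x, y).
eq_1_2 = classify("txy", ["t", "xx", "yy", ""])
print(eq_1_1, eq_1_2)
```

The two calls reproduce the classification stated in the text: a fourth-order ordinary equation and a second-order partial one.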


Remark: A differential equation has order 0 if it contains no derivatives of the function
$u$. These are more properly treated as algebraic equations,† which, while of great interest
in their own right, are not the subject of this text. To be a bona fide differential equation,
it must contain at least one derivative of $u$, and hence have order $\ge 1$.


There are two common notations for partial derivatives, and we shall employ them
interchangeably. The first, used in (1.1) and (1.2), is the familiar Leibniz notation that
employs a $d$ to denote ordinary derivatives of functions of a single variable, and the $\partial$
symbol (usually also pronounced “dee”) for partial derivatives of functions of more than
one variable. An alternative, more compact notation employs subscripts to indicate
partial derivatives. For example, $u_t$ represents $\partial u/\partial t$, while $u_{xx}$ is used for
$\partial^2 u/\partial x^2$, and $u_{xxy}$ for $\partial^3 u/\partial x^2 \partial y$. Thus, in subscript notation, the
partial differential equation (1.2) is written

\[ u_t = u_{xx} + u_{yy} - u. \tag{1.3} \]


† Here, the term “algebraic equation” is used only to distinguish such equations from true
“differential equations”. It does not mean that the defining functions are necessarily algebraic,
e.g., polynomials. For example, the transcendental equation $\tan u = u$, which appears later in
(4.50), is still regarded as an algebraic equation in this book.

We will similarly abbreviate partial differential operators, sometimes writing $\partial/\partial x$ as
$\partial_x$, while $\partial^2/\partial x^2$ can be written as either $\partial_x^2$ or $\partial_{xx}$, and
$\partial^3/\partial x^2 \partial y$ becomes $\partial_{xxy} = \partial_x^2 \partial_y$.
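Translating between the two notations is purely mechanical, as the following sketch suggests. It converts a string of subscripts into LaTeX source for the corresponding Leibniz expression; the function name `leibniz` and the string encoding are illustrative assumptions, not anything defined in the text.

```python
from itertools import groupby


def leibniz(subscript, func="u"):
    """Convert subscripts such as 'xxy' into LaTeX for the Leibniz form."""
    order = len(subscript)
    if order == 0:
        return func  # no subscripts: the function itself

    def power(k):
        # An exponent is written only when it exceeds one.
        return "" if k == 1 else f"^{k}"

    # Group consecutive repeats, so 'xxy' yields (x, twice) then (y, once).
    denominator = " ".join(
        rf"\partial {var}{power(len(list(run)))}"
        for var, run in groupby(subscript)
    )
    return rf"\frac{{\partial{power(order)} {func}}}{{{denominator}}}"


print(leibniz("t"))
print(leibniz("xx"))
print(leibniz("xxy"))
```

Because the functions considered are assumed smooth enough for mixed partial derivatives to be equal, the order in which the subscripts are listed does not affect the derivative, only the way the denominator is typeset.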

It  is  worth  pointing  out  that  the  preponderance  of  differential  equations  arising  in 
applications,  in  science,  in  engineering,  and  within  mathematics  itself  are  of  either  first 
or  second  order,  with  the  latter  being  by  far  the  most  prevalent.  Third-order  equations 
arise when modeling waves in dispersive media, e.g., water waves or plasma waves.
Fourth-order equations show up in elasticity, particularly plate and beam mechanics, and in image
processing. Equations of order $\ge 5$ are very rare.

A basic prerequisite for studying this text is the ability to solve simple ordinary
differential equations: first-order equations; linear constant-coefficient equations, both
homogeneous and inhomogeneous; and linear systems. In addition, we shall assume some
familiarity with the basic theorems concerning the existence and uniqueness of solutions to initial
value problems. There are many good introductory texts, including [18,20,23]. More
advanced treatises include [31,52,54,59]. Partial differential equations are considerably
more  demanding,  and  can  challenge  the  analytical  skills  of  even  the  most  accomplished 
mathematician.  Many  of  the  most  effective  solution  strategies  rely  on  reducing  the  partial 
differential  equation  to  one  or  more  ordinary  differential  equations.  Thus,  in  the  course  of 
our  study  of  partial  differential  equations,  we  will  need  to  develop,  ab  initio,  some  of  the 
more  advanced  aspects  of  the  theory  of  ordinary  differential  equations,  including  boundary 
value  problems,  eigenvalue  problems,  series  solutions,  singular  points,  and  special  functions. 

Following  the  introductory  remarks  in  the  present  chapter,  the  exposition  begins  in 
earnest  with  simple  first-order  equations,  concentrating  on  those  that  arise  as  models  of 
wave  phenomena.  Most  of  the  remainder  of  the  text  will  be  devoted  to  understanding  and 
solving  the  three  essential  linear  second-order  partial  differential  equations  in  one,  two, 
and three space dimensions:‡ the heat equation, modeling thermodynamics in a continuous
medium, as well as diffusion of animal populations and chemical pollutants; the wave
equation, modeling vibrations of bars, strings, plates, and solid bodies, as well as acoustic,
fluid, and electromagnetic vibrations; and the Laplace equation and its inhomogeneous
counterpart, the Poisson equation, governing the mechanical and thermal equilibria of
bodies,  as  well  as  fluid-mechanical  and  electromagnetic  potentials. 

Each  increase  in  dimension  requires  an  increase  in  mathematical  sophistication,  as 
well  as  the  development  of  additional  analytic  tools  —  although  the  key  ideas  will  have 
all  appeared  once  we  reach  our  physical,  three-dimensional  universe.  The  three  starring 
examples  —  heat,  wave,  and  Laplace/Poisson  —  are  not  only  essential  to  a  wide  range 
of  applications,  but  also  serve  as  instructive  paradigms  for  the  three  principal  classes  of 
linear  partial  differential  equations  —  parabolic,  hyperbolic,  and  elliptic.  Some  interesting 
nonlinear  partial  differential  equations,  including  first-order  transport  equations  modeling 
shock waves, the second-order Burgers’ equation governing simple nonlinear diffusion
processes, and the third-order Korteweg-de Vries equation governing dispersive waves, will
also  be  discussed.  But,  in  such  an  introductory  text,  the  further  reaches  of  the  vast  realm 
of  nonlinear  partial  differential  equations  must  remain  unexplored,  awaiting  the  reader’s 
more  advanced  mathematical  excursions. 

More generally, a system of differential equations is a collection of one or more
equations relating the derivatives of one or more functions. It is essential that all the functions
occurring in the system depend on the same set of variables. The symbols representing
these functions are known as the dependent variables, while the variables that they depend
on are called the independent variables. Systems of differential equations are called
ordinary or partial according to whether there is one or more than one independent variable.
The order of the system is the order of the highest-order derivative occurring in any of its
equations.

‡ For us, dimension always refers to the number of space dimensions. Time, although
theoretically also a dimension, plays a very different physical role, and therefore (at least in
nonrelativistic systems) is to be treated on a separate footing.

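For example, the three-dimensional Navier-Stokes equations

\[
\begin{aligned}
\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} + v \frac{\partial u}{\partial y} + w \frac{\partial u}{\partial z}
  &= -\frac{\partial p}{\partial x} + \nu \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} \right), \\
\frac{\partial v}{\partial t} + u \frac{\partial v}{\partial x} + v \frac{\partial v}{\partial y} + w \frac{\partial v}{\partial z}
  &= -\frac{\partial p}{\partial y} + \nu \left( \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \frac{\partial^2 v}{\partial z^2} \right), \\
\frac{\partial w}{\partial t} + u \frac{\partial w}{\partial x} + v \frac{\partial w}{\partial y} + w \frac{\partial w}{\partial z}
  &= -\frac{\partial p}{\partial z} + \nu \left( \frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2} + \frac{\partial^2 w}{\partial z^2} \right), \\
\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} &= 0,
\end{aligned}
\tag{1.4}
\]
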

is a second-order system of differential equations that involves four functions, $u(t, x, y, z)$,
$v(t, x, y, z)$, $w(t, x, y, z)$, $p(t, x, y, z)$, each depending on four variables, while $\nu > 0$ is a
fixed constant. (The function $p$ necessarily depends on $t$, even though no $t$ derivative of
it appears in the system.) The independent variables are $t$, representing time, and $x, y, z$,
representing space coordinates. The dependent variables are $u, v, w, p$, with
$\mathbf{v} = (u, v, w)$ representing the velocity vector field of an incompressible fluid flow, e.g.,
water, and $p$ the accompanying pressure. The parameter $\nu$ measures the viscosity of the
fluid. The Navier-Stokes equations are fundamental in fluid mechanics, [12], and are
notoriously difficult to solve, either analytically or numerically. Indeed, establishing the existence
or nonexistence of solutions for all future times remains a major unsolved problem in
mathematics, whose resolution will earn you a one-million-dollar prize; see http://www.claymath.org
for details. The Navier-Stokes equations first appeared in the early 1800s in works of the
French applied mathematician/engineer Claude-Louis Navier and, later, the British applied
mathematician George Stokes, whom you already know from his eponymous multivariable
calculus theorem.* The inviscid case, $\nu = 0$, is known as the Euler equations in honor of their
discoverer, the incomparably influential eighteenth-century Swiss mathematician Leonhard
Euler.
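For orientation, the four scalar equations of the Navier-Stokes system are commonly collected into a single vector statement. The display below is a sketch under two assumptions that the text leaves implicit: the fluid density is normalized to one, and there is no external forcing. The gradient $\nabla$ and Laplacian $\Delta$ in the variables $x, y, z$ are standard notation that has not been introduced at this point in the text.

```latex
\[
\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v} \cdot \nabla)\,\mathbf{v}
  = -\nabla p + \nu\,\Delta \mathbf{v},
\qquad
\nabla \cdot \mathbf{v} = 0,
\qquad
\mathbf{v} = (u, v, w).
\]
```

Reading off the first component gives an equation for $u$ in which the convective terms $u\,u_x + v\,u_y + w\,u_z$ appear on the left and the pressure gradient and viscous terms on the right, while the divergence condition expresses incompressibility.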

* Interestingly, Stokes’ Theorem was taken from an 1850 letter that Lord Kelvin wrote to
Stokes, who turned it into an undergraduate exam question for the Smith Prize at Cambridge
University in England. However, unbeknownst to either, the result had, in fact, been discovered
earlier by George Green, the father of Green’s Theorem and also the Green’s function, which will
be the subject of Chapter 6.

We shall be employing a few basic notational conventions regarding the variables that
appear in our differential equations. We always use $t$ to denote time, while $x, y, z$ will
represent (Cartesian) space coordinates. Polar coordinates $r, \theta$, cylindrical coordinates
$r, \theta, z$, and spherical coordinates** $r, \theta, \varphi$ will also be used when needed. An
equilibrium equation models an unchanging physical system, and so involves only the space
variable(s). The time variable appears when modeling dynamical, meaning time-varying,
processes. Both time and space coordinates are (usually) independent variables. The dependent
variables will mostly be denoted by $u, v, w$, although occasionally, particularly in representing
particular physical quantities, other letters may be employed, e.g., the pressure $p$ in
(1.4). On the other hand, the letters $f, g, h$ typically represent specified functions of the
independent variables, e.g., forcing or boundary or initial conditions.

** See Section 12.2 for our notational convention.

In this introductory text, we must confine our attention to the most basic analytic
and  numerical  solution  techniques  for  a  select  few  of  the  most  important  partial  differential 
equations.  More  advanced  topics,  including  all  systems  of  partial  differential  equations, 
must  be  deferred  to  graduate  and  research-level  texts,  e.g.,  [35,38,44,61,99].  In  fact, 
many  important  issues  remain  incompletely  resolved  and/or  poorly  understood,  making 
partial  differential  equations  one  of  the  most  active  and  exciting  fields  of  contemporary 
mathematical  research.  One  of  my  goals  is  that,  by  reading  this  book,  you  will  be  both 
inspired  and  equipped  to  venture  much  further  into  this  fascinating  and  essential  area  of 
mathematics  and/or  its  remarkable  range  of  applications  throughout  science,  engineering, 
economics,  biology,  and  beyond. 


Exercises 


1.1. Classify each of the following differential equations as ordinary or partial, and equilibrium
or dynamic; then write down its order.
(a) $\frac{du}{dx} + x u = 1$, (b) $\frac{\partial u}{\partial t} + u \frac{\partial u}{\partial x} = x$,
(c) $u_{tt} = 9 u_{xx}$, (d) […], (e) […], (f) $\frac{d^2 u}{dt^2} + 3u = \sin t$,
(g) $u_{xx} + u_{yy} + u_{zz} + (x + y + z)\,u = 0$, (h) $u_{xx} = x + u$,
(i) […], (j) […], (k) […].


1.2. In two space dimensions, the Laplacian is defined as the second-order partial differential
operator $\Delta = \partial_x^2 + \partial_y^2$. Write out the following partial differential equations in
(i) Leibniz notation; (ii) subscript notation: (a) the Laplace equation $\Delta u = 0$; (b) the
Poisson equation $-\Delta u = f$; (c) the two-dimensional heat equation $\partial_t u = \Delta u$;
(d) the von Karman plate equation $\Delta^2 u = 0$.

1.3. Answer Exercise 1.2 for the three-dimensional Laplacian
$\Delta = \partial_x^2 + \partial_y^2 + \partial_z^2$.

1.4. Identify the independent variables, the dependent variables, and the order of the following
systems of partial differential equations: (a) […];
(b) $u_{xx} + v_{yy} = \cos(x + y)$, $u_x v_y - u_y v_x = 1$; (c) […];
(d) $u_t + u u_x + v u_y = p_x$, $v_t + u v_x + v v_y = p_y$, $u_x + v_y = 0$;
(e) […].


Classical  Solutions 

Let us now focus our attention on a single differential equation involving a single, scalar-valued function $u$ that depends on one or more independent variables. The function $u$ is usually real-valued, although complex-valued functions can, and do, play a role in the analysis. Everything that we say in this section will, when suitably adapted, apply to systems of differential equations.

By  a  solution  we  mean  a  sufficiently  smooth  function  u  of  the  independent  variables 
that  satisfies  the  differential  equation  at  every  point  of  its  domain  of  definition.  We  do  not 
necessarily  require  that  the  solution  be  defined  for  all  possible  values  of  the  independent 
variables.  Indeed,  usually  the  differential  equation  is  imposed  on  some  domain  D  contained 
in  the  space  of  independent  variables,  and  we  seek  a  solution  defined  only  on  D.  In  general, 
the  domain  D  will  be  an  open  subset,  usually  connected  and,  particularly  in  equilibrium 
equations, often bounded, with a reasonably nice boundary, denoted by $\partial D$.

We will call a function smooth if it can be differentiated sufficiently often, at least so that all of the derivatives appearing in the equation are well defined on the domain of interest $D$. More specifically, if the differential equation has order $n$, then we require that the solution $u$ be of class $C^n$, which means that it and all its derivatives of order $\le n$ are continuous functions in $D$, and such that the differential equation that relates the derivatives of $u$ holds throughout $D$. However, on occasion, e.g., when dealing with shock waves, we will consider more general types of solutions. The most important such class consists of the so-called “weak solutions” to be introduced in Section 10.4. To emphasize the distinction, the smooth solutions described above are often referred to as classical solutions. In this book, the term “solution” without extra qualification will usually mean “classical solution”.


Example 1.1. A classical solution to the heat equation

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \tag{1.5}$$

is a function $u(t,x)$, defined on a domain $D \subset \mathbb{R}^2$, such that all of the functions

$$u(t,x), \quad \frac{\partial u}{\partial t}(t,x), \quad \frac{\partial u}{\partial x}(t,x), \quad \frac{\partial^2 u}{\partial t^2}(t,x), \quad \frac{\partial^2 u}{\partial t\,\partial x}(t,x) = \frac{\partial^2 u}{\partial x\,\partial t}(t,x), \quad \frac{\partial^2 u}{\partial x^2}(t,x),$$

are well defined and continuous† at every point $(t,x) \in D$, so that $u \in C^2(D)$, and, moreover, (1.5) holds at every $(t,x) \in D$. Observe that, even though only $u_t$ and $u_{xx}$ explicitly appear in the heat equation, we require continuity of all the partial derivatives of order $\le 2$ in order that $u$ qualify as a classical solution. For example,

$$u(t,x) = t + \tfrac12 x^2 \tag{1.6}$$

is a solution to the heat equation that is defined on the full domain $D = \mathbb{R}^2$ because it is‡ $C^2$, and, moreover,

$$\frac{\partial u}{\partial t} = 1 = \frac{\partial^2 u}{\partial x^2}.$$

Another, more complicated but extremely important, solution is

$$u(t,x) = \frac{e^{-x^2/(4t)}}{2\sqrt{\pi t}}. \tag{1.7}$$

One easily verifies that $u \in C^2$ and, moreover, solves the heat equation on the domain $D = \{\,(t,x) \mid t > 0\,\} \subset \mathbb{R}^2$. The reader is invited to verify this by computing $\partial u/\partial t$ and $\partial^2 u/\partial x^2$, and then checking that they are equal. Finally, with $i = \sqrt{-1}$ denoting the imaginary unit, we note that

$$u(t,x) = e^{-t + ix} = e^{-t}\cos x + i\,e^{-t}\sin x, \tag{1.8}$$

the second expression following from Euler’s formula (A.11), defines a complex-valued solution to the heat equation. This can be verified directly, since the rules for differentiating complex exponentials are identical to those for their real counterparts:

$$\frac{\partial u}{\partial t} = -e^{-t+ix}, \qquad \frac{\partial u}{\partial x} = i\,e^{-t+ix}, \qquad \text{and so} \qquad \frac{\partial^2 u}{\partial x^2} = -e^{-t+ix} = \frac{\partial u}{\partial t}.$$

It is worth pointing out that both the real part, $e^{-t}\cos x$, and the imaginary part, $e^{-t}\sin x$, of the complex solution (1.8) are individual real solutions, which is indicative of a fairly general property.

† The equality of the mixed partial derivatives follows from a general theorem in multivariable calculus, [8, 97, 108]. Classical solutions automatically enjoy equality of all their relevant mixed partial derivatives.

‡ In fact, the function (1.6) is $C^\infty$, meaning infinitely differentiable, on all of $\mathbb{R}^2$.
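The verification of (1.7) that the text leaves to the reader can be sketched as follows, assuming $t > 0$ throughout. This is one possible route, organized so that each derivative is written as a multiple of $u$ itself:

```latex
% Worked check that (1.7) satisfies u_t = u_xx for t > 0.
u = \frac{e^{-x^2/(4t)}}{2\sqrt{\pi t}}
\quad\Longrightarrow\quad
\frac{\partial u}{\partial x} = -\frac{x}{2t}\,u,
\qquad
\frac{\partial^2 u}{\partial x^2}
  = -\frac{1}{2t}\,u - \frac{x}{2t}\,\frac{\partial u}{\partial x}
  = \left(\frac{x^2}{4t^2} - \frac{1}{2t}\right) u,
\\[4pt]
\frac{\partial u}{\partial t}
  = \left(\frac{x^2}{4t^2} - \frac{1}{2t}\right) u
  = \frac{\partial^2 u}{\partial x^2}.
```

The first term in $\partial u/\partial t$ comes from differentiating the exponent $-x^2/(4t)$, and the second from the factor $t^{-1/2}$.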


Incidentally, most partial differential equations arising in physical applications are real, and, although complex solutions often facilitate their analysis, at the end of the day we require real, physically meaningful solutions. A notable exception is quantum mechanics, which is an inherently complex-valued physical theory. For example, the one-dimensional Schrödinger equation

$$i\hbar\,\frac{\partial u}{\partial t} = -\frac{\hbar^2}{2m}\,\frac{\partial^2 u}{\partial x^2} + V(x)\,u, \tag{1.9}$$

with $\hbar$ denoting Planck’s constant, which is real, governs the dynamical evolution of the complex-valued wave function $u(t,x)$ describing the probabilistic distribution of a quantum particle of mass $m$, e.g., an electron, moving in the force field prescribed by the (real) potential function $V(x)$. While the solution $u$ is complex-valued, the independent variables $t, x$, representing time and space, remain real.


Initial  Conditions  and  Boundary  Conditions 

How  many  solutions  does  a  partial  differential  equation  have?  In  general,  lots.  Even 
ordinary  differential  equations  have  infinitely  many  solutions.  Indeed,  the  general  solution 
to  a  single  nth  order  ordinary  differential  equation  depends  on  n  arbitrary  constants.  The 
solutions  to  partial  differential  equations  are  yet  more  numerous,  in  that  they  depend 
on  arbitrary  functions.  Very  roughly,  we  can  expect  the  solution  to  an  nth  order  partial 
differential  equation  involving  m  independent  variables  to  depend  on  n  arbitrary  functions 
of  m  —  1  variables.  But  this  must  be  taken  with  a  large  grain  of  salt  —  only  in  a  few  special 
instances  will  we  actually  be  able  to  express  the  solution  in  terms  of  arbitrary  functions. 
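Two standard illustrations of this counting heuristic are sketched below. They are assumed examples, not taken from the text above, and are stated for functions $u(x,y)$ defined on all of $\mathbb{R}^2$, with $f$ and $g$ denoting arbitrary sufficiently smooth functions:

```latex
% First order (n = 1), two independent variables (m = 2).
\frac{\partial u}{\partial x} = 0
\quad\Longrightarrow\quad u(x,y) = f(y)
\qquad \text{(one arbitrary function of one variable)},
\\[4pt]
% Second order (n = 2), two independent variables (m = 2).
\frac{\partial^2 u}{\partial x\,\partial y} = 0
\quad\Longrightarrow\quad u(x,y) = f(x) + g(y)
\qquad \text{(two arbitrary functions of one variable)}.
```

In both cases the number of arbitrary functions equals the order $n$, and each depends on $m - 1 = 1$ variable, in line with the rough count.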

The solutions to dynamical ordinary differential equations are singled out by the imposition of initial conditions, resulting in an initial value problem. On the other hand,
equations  modeling  equilibrium  phenomena  require  boundary  conditions  to  specify  their 
solutions  uniquely,  resulting  in  a  boundary  value  problem.  We  assume  that  the  reader  is 
already  familiar  with  the  basics  of  initial  value  problems  for  ordinary  differential  equations. 
But  we  will  take  time  to  develop  the  perhaps  less  familiar  case  of  boundary  value  problems 
for  ordinary  differential  equations  in  Chapter  6. 

A similar specification of auxiliary conditions applies to partial differential equations. Equations modeling equilibrium phenomena are supplemented by boundary conditions imposed on the boundary of the domain of interest. In favorable circumstances, the boundary
conditions  serve  to  single  out  a  unique  solution.  For  example,  the  equilibrium  temperature 
of  a  body  is  uniquely  specified  by  its  boundary  behavior.  If  the  domain  is  unbounded, 
one  must  also  restrict  the  nature  of  the  solution  at  large  distances,  e.g.,  by  asking  that  it 
remain  bounded.  The  combination  of  a  partial  differential  equation  along  with  suitable 
boundary  conditions  is  referred  to  as  a  boundary  value  problem. 

There are three principal types of boundary value problems that arise in most applications. Specifying the value of the solution along the boundary of the domain is called a Dirichlet boundary condition, to honor the nineteenth-century analyst Johann Peter Gustav Lejeune Dirichlet. Specifying the normal derivative of the solution along the boundary results in a Neumann boundary condition, named after his contemporary Carl Gottfried Neumann. Prescribing the function along part of the boundary and the normal derivative along the remainder results in a mixed boundary value problem. For example, in thermal equilibrium, the Dirichlet boundary value problem specifies the temperature of a body along its boundary, and our task is to find the interior temperature distribution by solving an appropriate partial differential equation. Similarly, the Neumann boundary value problem prescribes the heat flux through the boundary. In particular, an insulated boundary has no heat flux, and hence the normal derivative of the temperature is zero on the boundary. The mixed boundary value problem prescribes the temperature along part of the boundary and the heat flux along the remainder. Again, our task is to determine the interior temperature of the body.
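In symbols, the three types of boundary conditions can be sketched as follows. The names are assumptions introduced only for this illustration: $\mathbf{n}$ is the unit outward normal on $\partial D$, $h$ and $k$ are prescribed boundary data, and $B \subset \partial D$ is the part of the boundary on which the solution itself is prescribed:

```latex
\text{Dirichlet:} \qquad u = h \quad \text{on } \partial D,
\\[4pt]
\text{Neumann:} \qquad \frac{\partial u}{\partial \mathbf{n}} = \nabla u \cdot \mathbf{n} = k \quad \text{on } \partial D,
\\[4pt]
\text{mixed:} \qquad u = h \quad \text{on } B,
\qquad \frac{\partial u}{\partial \mathbf{n}} = k \quad \text{on } \partial D \setminus B .
```

An insulated boundary corresponds to the Neumann condition with $k = 0$.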

For  partial  differential  equations  modeling  dynamical  processes,  in  which  time  is  one  of 
the  independent  variables,  the  solution  is  to  be  specified  by  one  or  more  initial  conditions. 
The  number  of  initial  conditions  required  depends  on  the  highest-order  time  derivative 
that  appears  in  the  equation.  For  example,  in  thermodynamics,  which  involves  only  the 
first-order  time  derivative  of  the  temperature,  the  initial  condition  requires  specifying  the 
temperature of the body at the initial time. Newtonian mechanics describes the acceleration or second-order time derivative of the motion, and so requires two initial conditions:
the  initial  position  and  initial  velocity  of  the  system.  On  bounded  domains,  one  must  also 
impose  suitable  boundary  conditions  in  order  to  uniquely  characterize  the  solution  and 
hence  the  subsequent  dynamical  behavior  of  the  physical  system.  The  combination  of  the 
partial  differential  equation,  the  initial  conditions,  and  the  boundary  conditions  leads  to  an 
initial-boundary  value  problem.  We  will  encounter,  and  solve,  many  important  examples 
of  such  problems  during  the  course  of  this  text. 


Remark: An additional consideration is that, besides any smoothness required by the partial differential equation within the domain, the solution and any of its derivatives specified in any initial or boundary condition should also be continuous at the initial or boundary point where the condition is imposed. For example, if the initial condition specifies the function value $u(0,x)$ for $a < x < b$, while the boundary conditions specify the derivatives $\frac{\partial u}{\partial x}(t,a)$ and $\frac{\partial u}{\partial x}(t,b)$ for $t > 0$, then, in addition to any smoothness required inside the domain $\{a < x < b,\ t > 0\}$, we also require that $u$ be continuous at all initial points $(0,x)$, and that its derivative $\frac{\partial u}{\partial x}$ be continuous at all boundary points $(t,a)$ and $(t,b)$, in order that $u(t,x)$ qualify as a classical solution to the initial-boundary value problem.


Exercises


1.5. Show that the following functions $u(x,y)$ define classical solutions to the two-dimensional Laplace equation $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$. Be careful to specify an appropriate domain. (a) $e^x \cos y$, (b) $1 + x^2 - y^2$, (c) $x^3 - 3xy^2$, (d) $\log(x^2 + y^2)$, (e) $\tan^{-1}(y/x)$, (f) $\dfrac{\ldots}{x^2 + y^2}$.

1.6. Find all solutions $u = f(r)$ of the two-dimensional Laplace equation $u_{xx} + u_{yy} = 0$ that depend only on the radial coordinate $r = \sqrt{x^2 + y^2}$.

1.7. Find all (real) solutions to the two-dimensional Laplace equation $u_{xx} + u_{yy} = 0$ of the form $u = \log p(x,y)$, where $p(x,y)$ is a quadratic polynomial.

1.8. (a) Find all quadratic polynomial solutions of the three-dimensional Laplace equation $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$. (b) Find all the homogeneous cubic polynomial solutions.

1.9. Find all polynomial solutions $p(t,x)$ of the heat equation $u_t = u_{xx}$ with $\deg p \le 3$.

1.10. Show that each of the following functions $u(t,x)$ is a solution to the wave equation $u_{tt} = 4u_{xx}$: (a) $4t^2 + x^2$; (b) $\cos(x + 2t)$; (c) $\sin 2t \cos x$; (d) $e^{-(x - 2t)^2}$.

1.11. Find all polynomial solutions $p(t,x)$ of the wave equation $u_{tt} = u_{xx}$ with (a) $\deg p \le 2$, (b) $\deg p = 3$.

1.12. Suppose $u(t,x)$ and $v(t,x)$ are $C^2$ functions defined on $\mathbb{R}^2$ that satisfy the first-order system of partial differential equations $u_t = v_x$, $v_t = u_x$.

(a)  Show  that  both  u  and  v  are  classical  solutions  to  the  wave  equation  utt  =  uxx.  Which 
result  from  multivariable  calculus  do  you  need  to  justify  the  conclusion? 

(b)  Conversely,  given  a  classical  solution  u(t,x)  to  the  wave  equation,  can  you  construct  a 
function  v(t,x)  such  that  u(t,x),v(t,x)  form  a  solution  to  the  first-order  system? 

1.13. Find all solutions $u = f(r)$ of the three-dimensional Laplace equation $u_{xx} + u_{yy} + u_{zz} = 0$ that depend only on the radial coordinate $r = \sqrt{x^2 + y^2 + z^2}$.

1.14. Let $u(x,y)$ be defined on a domain $D \subset \mathbb{R}^2$. Suppose you know that all its second-order partial derivatives, $u_{xx}, u_{xy}, u_{yx}, u_{yy}$, are defined and continuous on all of $D$. Can you conclude that $u \in C^2(D)$?


1.15.  Write  down  a  partial  differential  equation  that  has 

(a)  no  real  solutions;  (b)  exactly  one  real  solution;  (c)  exactly  two  real  solutions. 

1.16. Let $u(x,y) = xy\,\dfrac{x^2 - y^2}{x^2 + y^2}$ for $(x,y) \ne (0,0)$, while $u(0,0) = 0$. Prove that

$$\frac{\partial^2 u}{\partial x\,\partial y}(0,0) = 1 \ne -1 = \frac{\partial^2 u}{\partial y\,\partial x}(0,0).$$

Explain  why  this  example  does  not  contradict  the  theorem  on  the  equality  of  mixed  partials. 


Linear  and  Nonlinear  Equations 

As with algebraic equations and ordinary differential equations, there is a crucial distinction between linear and nonlinear partial differential equations, and one must have a firm grasp of the linear theory before venturing into the nonlinear wilderness. While linear algebraic equations are (modulo numerical difficulties) eminently solvable by a variety of techniques, linear ordinary differential equations, of order $\ge 2$, already present a challenge, as most
cannot  be  solved  in  terms  of  elementary  functions.  Indeed,  as  we  will  learn  in  Chapter  11, 
solving  many  of  those  equations  that  arise  in  applications  requires  introducing  new  types 
of  “special  functions”  that  are  typically  not  encountered  in  a  basic  calculus  course.  Linear 
partial  differential  equations  are  of  a  yet  higher  level  of  difficulty,  and  only  a  small  handful 
of  specific  equations  can  be  completely  solved.  Moreover,  explicit  solutions  tend  to  be 
expressible  only  in  the  form  of  infinite  series,  requiring  subtle  analytic  tools  to  understand 
their  convergence  and  properties.  For  the  vast  majority  of  partial  differential  equations,  the 
only  feasible  means  of  producing  general  solutions  is  through  numerical  approximation.  In 
this  book,  we  will  study  the  two  most  basic  numerical  schemes:  finite  differences  and  finite 
elements.  Keep  in  mind  that,  in  order  to  develop  and  understand  numerics  for  partial 
differential  equations,  one  must  already  have  a  good  understanding  of  their  analytical 
properties. 

The distinguishing feature of linearity is that it enables one to straightforwardly combine solutions to form new solutions, through a general Superposition Principle. Linear superposition is universally applicable to all linear equations and systems, including linear algebraic systems, linear ordinary differential equations, linear partial differential equations, linear initial and boundary value problems, as well as linear integral equations,
linear  control  systems,  and  so  on.  Let  us  introduce  the  basic  idea  in  the  context  of  a  single 
differential  equation. 

A  differential  equation  is  called  homogeneous  linear  if  both  sides  are  sums  of  terms, 
each  of  which  involves  the  dependent  variable  u  or  one  of  its  derivatives  to  the  first  power; 
on the other hand, there is no restriction on how the terms involve the independent variables. Thus,

$$\frac{d^2 u}{dx^2} + \frac{u}{1 + x^2} = 0$$

is a homogeneous linear second-order ordinary differential equation. Examples of homogeneous linear partial differential equations include the heat equation (1.5), the partial
differential  equation  (1.2),  and  the  equation 


du 

~dt 


+  cos(x 


On the other hand, Burgers’ equation

$$\frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x} = \frac{\partial^2 u}{\partial x^2} \tag{1.10}$$

is not linear, since the second term involves the product of $u$ and its derivative $u_x$. A similar terminology is applied to systems of partial differential equations. For example, the Navier-Stokes system (1.4) is not linear because of the terms $u u_x$, $v u_y$, etc., although its final constituent equation is linear.

A more precise definition of a homogeneous linear differential equation begins with the concept of a linear differential operator $L$. Such operators are assembled by summing the basic partial derivative operators, with either constant coefficients or, more generally, coefficients depending on the independent variables. The operator acts on sufficiently smooth functions depending on the relevant independent variables. According to Definition B.32, linearity imposes two key requirements:

$$L[u + v] = L[u] + L[v], \qquad L[cu] = c\,L[u], \tag{1.11}$$

for any two (sufficiently smooth) functions $u$, $v$, and any constant $c$.

Definition 1.2. A homogeneous linear differential equation has the form

$$L[u] = 0, \tag{1.12}$$

where $L$ is a linear differential operator.


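As a simple example, consider the second-order differential operator

$$L = \frac{\partial^2}{\partial x^2}, \qquad \text{whereby} \qquad L[u] = \frac{\partial^2 u}{\partial x^2}$$

for any $C^2$ function $u(x,y)$. The linearity requirements (1.11) follow immediately from basic properties of differentiation:

$$L[u + v] = \frac{\partial^2}{\partial x^2}(u + v) = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 v}{\partial x^2} = L[u] + L[v], \qquad L[cu] = \frac{\partial^2}{\partial x^2}(cu) = c\,\frac{\partial^2 u}{\partial x^2} = c\,L[u],$$

which are valid for any $C^2$ functions $u, v$ and any constant $c$. The corresponding homogeneous linear differential equation $L[u] = 0$ is

$$\frac{\partial^2 u}{\partial x^2} = 0.$$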

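The heat equation (1.5) is based on the linear partial differential operator

$$L = \partial_t - \partial_x^2, \qquad \text{with} \qquad L[u] = \partial_t u - \partial_x^2 u = u_t - u_{xx} = 0. \tag{1.13}$$

Linearity follows as above:

$$L[u + v] = \partial_t(u + v) - \partial_x^2(u + v) = (\partial_t u - \partial_x^2 u) + (\partial_t v - \partial_x^2 v) = L[u] + L[v],$$

$$L[cu] = \partial_t(cu) - \partial_x^2(cu) = c\,(\partial_t u - \partial_x^2 u) = c\,L[u].$$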

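Similarly, the linear differential operator

$$L = \partial_t^2 - \partial_x\,\kappa(x)\,\partial_x = \partial_t^2 - \kappa(x)\,\partial_x^2 - \kappa'(x)\,\partial_x,$$

where $\kappa(x)$ is a prescribed $C^1$ function of $x$ alone, defines the homogeneous linear partial differential equation

$$L[u] = \partial_t^2 u - \partial_x\bigl(\kappa(x)\,\partial_x u\bigr) = u_{tt} - \partial_x\bigl(\kappa(x)\,u_x\bigr) = u_{tt} - \kappa(x)\,u_{xx} - \kappa'(x)\,u_x = 0,$$

which is used to model vibrations in a nonuniform one-dimensional medium.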

The  defining  attributes  of  linear  operators  (1.11)  imply  the  key  properties  shared  by 
all  homogeneous  linear  (differential)  equations. 

Proposition 1.3. The sum of two solutions to a homogeneous linear differential equation is again a solution, as is the product of a solution with any constant.

Proof: Let $u_1, u_2$ be solutions, meaning that $L[u_1] = 0$ and $L[u_2] = 0$. Then, thanks to linearity,

$$L[u_1 + u_2] = L[u_1] + L[u_2] = 0,$$

and hence their sum $u_1 + u_2$ is a solution. Similarly, if $c$ is any constant and $u$ any solution, then

$$L[cu] = c\,L[u] = c \cdot 0 = 0,$$

and so the constant multiple $cu$ is also a solution. Q.E.D.


As a result, starting with a handful of solutions to a homogeneous linear differential equation, by repeating these operations of adding solutions and multiplying by constants, we are able to build up large families of solutions. In the case of the heat equation (1.5), we are already in possession of two solutions, namely (1.6) and (1.7). Multiplying each by a constant produces two infinite families of solutions:

$$u(t,x) = c_1\left(t + \tfrac12 x^2\right) \qquad \text{and} \qquad u(t,x) = \frac{c_2\,e^{-x^2/(4t)}}{2\sqrt{\pi t}},$$

where $c_1, c_2$ are arbitrary constants. Moreover, one can add the latter solutions together, producing a two-parameter family of solutions

$$u(t,x) = c_1\left(t + \tfrac12 x^2\right) + \frac{c_2\,e^{-x^2/(4t)}}{2\sqrt{\pi t}},$$

valid for any choice of the constants $c_1, c_2$.
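The two-parameter family can also be spot-checked numerically. The following is a minimal sketch, not part of the original text: it assumes central finite differences with a hypothetical step size `h`, and the helper names (`u1`, `u2`, `superposition`, `heat_residual`) are invented for this illustration. It estimates $u_t - u_{xx}$ for several choices of $c_1, c_2$ at sample points with $t > 0$.

```python
import math

def u1(t, x):
    """Polynomial solution (1.6): t + x^2/2."""
    return t + 0.5 * x * x

def u2(t, x):
    """Solution (1.7); only defined for t > 0."""
    return math.exp(-x * x / (4.0 * t)) / (2.0 * math.sqrt(math.pi * t))

def superposition(c1, c2):
    """Return the linear combination c1*u1 + c2*u2 as a new function."""
    return lambda t, x: c1 * u1(t, x) + c2 * u2(t, x)

def heat_residual(u, t, x, h=1e-3):
    """Central-difference estimate of u_t - u_xx at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / (h * h)
    return u_t - u_xx

# Arbitrary sample coefficients and sample points with t > 0.
coefficient_pairs = [(1.0, 0.0), (0.0, 1.0), (3.0, -2.0), (-0.5, 7.25)]
sample_points = [(t, x) for t in (1.0, 2.0) for x in (-1.0, 0.0, 1.5)]

max_residual = max(
    abs(heat_residual(superposition(c1, c2), t, x))
    for c1, c2 in coefficient_pairs
    for t, x in sample_points
)
print("largest |u_t - u_xx| over all samples:", max_residual)
```

The printed residual should be small but not exactly zero, since finite differences only approximate the derivatives; a numerical check of this kind supports, but does not replace, the proof of Proposition 1.3.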

The preceding construction is a special case of the general Superposition Principle for homogeneous linear equations:

Theorem 1.4. If $u_1, \ldots, u_k$ are solutions to a common homogeneous linear equation $L[u] = 0$, then the linear combination, or superposition, $u = c_1 u_1 + \cdots + c_k u_k$ is a solution for any choice of constants $c_1, \ldots, c_k$.

Proof: Repeatedly applying the linearity requirements (1.11), we find

$$\begin{aligned} L[u] &= L[c_1 u_1 + \cdots + c_k u_k] = L[c_1 u_1 + \cdots + c_{k-1} u_{k-1}] + L[c_k u_k] \\ &= \cdots = L[c_1 u_1] + \cdots + L[c_k u_k] = c_1 L[u_1] + \cdots + c_k L[u_k]. \end{aligned} \tag{1.14}$$

In particular, if the functions are solutions, so $L[u_1] = 0, \ldots, L[u_k] = 0$, then the right-hand side of (1.14) vanishes, proving that $u$ also solves the equation $L[u] = 0$. Q.E.D.

In the linear algebraic language of Appendix B, Theorem 1.4 tells us that the solutions to a homogeneous linear partial differential equation form a vector space. The same holds true for linear algebraic equations, [89], and linear ordinary differential equations, [18, 20, 23, 52]. In the latter two situations, once one finds a sufficient number of independent solutions, the general solution is obtained as a linear combination thereof. In
the  language  of  linear  algebra,  the  solution  space  is  finite-dimensional.  In  contrast,  most 
linear  systems  of  partial  differential  equations  admit  an  infinite  number  of  independent 
solutions,  meaning  that  the  solution  space  is  infinite-dimensional,  and,  as  a  consequence, 
one  cannot  hope  to  build  the  general  solution  by  taking  finite  linear  combinations.  Instead, 
one  requires  the  far  more  delicate  operation  of  forming  infinite  series  involving  the  basic 
solutions.  Such  considerations  will  soon  lead  us  into  the  heart  of  Fourier  analysis,  and 
require  spending  an  entire  chapter  developing  the  required  analytic  tools. 
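A concrete instance of an infinite family of independent solutions, offered as an illustrative sketch for the heat equation (1.5): the functions below are a standard example in which the case $n = 1$ is the real part of (1.8), and the constants $c_n$ and the cutoff $N$ are hypothetical names introduced here.

```latex
u_n(t,x) = e^{-n^2 t}\cos nx, \qquad n = 0, 1, 2, \ldots,
\\[4pt]
\frac{\partial u_n}{\partial t} = -n^2 e^{-n^2 t}\cos nx = \frac{\partial^2 u_n}{\partial x^2},
\\[4pt]
u(t,x) = \sum_{n=0}^{N} c_n\, e^{-n^2 t}\cos nx
\quad \text{solves } u_t = u_{xx} \text{ for every finite } N \text{ and all constants } c_n .
```

Theorem 1.4 covers only such finite sums. Letting $N \to \infty$ produces an infinite series, and whether the result converges and can be differentiated term by term is exactly the kind of question that requires the additional analytic tools mentioned above.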

Definition 1.5. An inhomogeneous linear differential equation has the form

$$L[v] = f, \tag{1.15}$$

where $L$ is a linear differential operator, $v$ is the unknown function, and $f$ is a prescribed nonzero function of the independent variables alone.

For example, the inhomogeneous form of the heat equation (1.13) is

$$L[v] = \partial_t v - \partial_x^2 v = v_t - v_{xx} = f(t,x), \tag{1.16}$$

where $f(t,x)$ is a specified function. This equation models the thermodynamics of a one-dimensional medium subject to an external heat source.

You  already  learned  the  basic  technique  for  solving  inhomogeneous  linear  equations 
in  your  study  of  elementary  ordinary  differential  equations.  Step  one  is  to  determine  the 
general  solution  to  the  homogeneous  equation.  Step  two  is  to  find  a  particular  solution  to 
the  inhomogeneous  version.  The  general  solution  to  the  inhomogeneous  equation  is  then 
obtained  by  adding  the  two  together.  Here  is  the  general  version  of  this  procedure: 


Theorem 1.6. Let $v_\star$ be a particular solution to the inhomogeneous linear equation $L[v_\star] = f$. Then the general solution to $L[v] = f$ is given by $v = v_\star + u$, where $u$ is the general solution to the corresponding homogeneous equation $L[u] = 0$.

Proof: Let us first show that $v = v_\star + u$ is also a solution whenever $L[u] = 0$. By linearity,

$$L[v] = L[v_\star + u] = L[v_\star] + L[u] = f + 0 = f.$$

To show that every solution to the inhomogeneous equation can be expressed in this manner, suppose $v$ satisfies $L[v] = f$. Set $u = v - v_\star$. Then, by linearity,

$$L[u] = L[v - v_\star] = L[v] - L[v_\star] = 0,$$

and hence $u$ is a solution to the homogeneous differential equation. Thus, $v = v_\star + u$ has the required form. Q.E.D.
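As a reminder of how this works in the familiar ordinary differential equation setting, here is a small worked sketch. The operator $L[v] = v'' + v$ and the forcing function $f(x) = x$ are assumed for illustration only and do not come from the text:

```latex
% Particular solution of the inhomogeneous equation v'' + v = x.
v_\star(x) = x, \qquad \text{since} \qquad v_\star'' + v_\star = 0 + x = x,
\\[4pt]
% General solution of the homogeneous equation u'' + u = 0.
u(x) = c_1 \cos x + c_2 \sin x,
\\[4pt]
% General solution of the inhomogeneous equation, as in Theorem 1.6.
v(x) = v_\star(x) + u(x) = x + c_1 \cos x + c_2 \sin x .
```

Any other particular solution, for instance $x + \cos x$, leads to the same family, because the difference of two particular solutions solves the homogeneous equation.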


In physical applications, one can interpret the particular solution $v_\star$ as a response of the system to the external forcing function. The solution $u$ to the homogeneous equation represents the system’s internal, unforced behavior. The general solution to the inhomogeneous linear equation is thus a combination, $v = v_\star + u$, of the external and internal responses.

Finally,  the  Superposition  Principle  for  inhomogeneous  linear  equations  allows  one  to 
combine  the  responses  of  the  system  to  different  external  forcing  functions.  The  proof  of 
this  result  is  left  to  the  reader  as  Exercise  1.26. 


Theorem 1.7. Let $v_1, \ldots, v_k$ be solutions to the inhomogeneous linear systems $L[v_1] = f_1, \ldots, L[v_k] = f_k$, involving the same linear operator $L$. Then, given any constants $c_1, \ldots, c_k$, the linear combination $v = c_1 v_1 + \cdots + c_k v_k$ solves the inhomogeneous system $L[v] = f$ for the combined forcing function $f = c_1 f_1 + \cdots + c_k f_k$.
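A quick numerical illustration of Theorem 1.7 for the heat operator $L = \partial_t - \partial_x^2$ of (1.16) is sketched below. The particular pairs $v_1 = t$, $f_1 = 1$ and $v_2 = \cos x$, $f_2 = \cos x$ are assumed examples chosen for this sketch (each satisfies $L[v_i] = f_i$ by direct differentiation), and the function names, constants, and finite-difference step are hypothetical.

```python
import math

def heat_operator(v, t, x, h=1e-3):
    """Central-difference estimate of L[v] = v_t - v_xx at (t, x)."""
    v_t = (v(t + h, x) - v(t - h, x)) / (2.0 * h)
    v_xx = (v(t, x + h) - 2.0 * v(t, x) + v(t, x - h)) / (h * h)
    return v_t - v_xx

# Assumed example pairs: L[v1] = f1 and L[v2] = f2.
def v1(t, x):
    return t

def f1(t, x):
    return 1.0

def v2(t, x):
    return math.cos(x)

def f2(t, x):
    return math.cos(x)

c1, c2 = 2.0, -3.0  # arbitrary constants

def v(t, x):
    """Superposed response c1*v1 + c2*v2."""
    return c1 * v1(t, x) + c2 * v2(t, x)

def f(t, x):
    """Combined forcing function c1*f1 + c2*f2."""
    return c1 * f1(t, x) + c2 * f2(t, x)

max_error = max(
    abs(heat_operator(v, t, x) - f(t, x))
    for t in (0.0, 1.0, 2.0)
    for x in (-1.0, 0.3, 2.0)
)
print("largest |L[v] - f| over the samples:", max_error)
```

The discrepancy is limited by the finite-difference approximation, so it should be small rather than exactly zero.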


The two general Superposition Principles furnish us with powerful tools for solving linear partial differential equations, which we shall repeatedly exploit throughout this text. In contrast, nonlinear partial differential equations are much tougher, and, typically, knowledge of several solutions is of scant help in constructing others. Indeed, finding even one solution to a nonlinear partial differential equation can be quite a challenge. While this text will primarily concentrate on analyzing the solutions, and their properties, of some of the most basic and most important linear partial differential equations, we will have occasion to briefly venture into the nonlinear realm, introducing some striking recent developments in this fascinating arena of contemporary research.
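To see concretely how superposition fails for a nonlinear equation, consider Burgers’ equation (1.10). The sketch below is an illustration under stated assumptions: the two exact solutions $u = x/t$ (for $t > 0$) and $u = 1$ are chosen for this example and can be confirmed by direct differentiation, while the function names and the finite-difference step `h` are hypothetical. It estimates the residual $u_t + u u_x - u_{xx}$ for each solution, for a constant multiple, and for the sum.

```python
def burgers_residual(u, t, x, h=1e-3):
    """Central-difference estimate of u_t + u*u_x - u_xx at (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2.0 * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / (h * h)
    return u_t + u(t, x) * u_x - u_xx

def ramp(t, x):
    """Exact solution x/t of Burgers' equation, valid for t > 0."""
    return x / t

def constant(t, x):
    """The constant function 1, which also solves Burgers' equation."""
    return 1.0

def doubled_ramp(t, x):
    """Constant multiple 2*(x/t); not a solution."""
    return 2.0 * ramp(t, x)

def ramp_plus_constant(t, x):
    """Sum of the two solutions; not a solution."""
    return ramp(t, x) + constant(t, x)

t0, x0 = 1.0, 0.7  # arbitrary sample point with t > 0
residual_ramp = burgers_residual(ramp, t0, x0)
residual_constant = burgers_residual(constant, t0, x0)
residual_doubled = burgers_residual(doubled_ramp, t0, x0)
residual_sum = burgers_residual(ramp_plus_constant, t0, x0)

print("x/t      :", residual_ramp)
print("1        :", residual_constant)
print("2 * x/t  :", residual_doubled)
print("x/t + 1  :", residual_sum)
```

A hand computation gives the exact residuals $2x/t^2$ for the doubled function and $1/t$ for the sum, so neither combination solves the equation, in contrast to Proposition 1.3 for the linear case.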


Exercises 


1.17. Classify the following differential equations as either (i) homogeneous linear; (ii) inhomogeneous linear; or (iii) nonlinear: (a) $u_t = x^2 u_{xx} + 2x u_x$, (b) $-u_{xx} - u_{yy} = \sin u$; (c) $u_{xx} + 2y u_{yy} = 3$; (d) $u_t + u u_x = 3u$, (e) $e^y u_x = e^x u_y$; (f) $u_t = 5u_{xxx} + x^2 u + x$.

1.18. Write down all possible solutions to the Laplace equation you can construct from the various solutions provided in Exercise 1.5 using linear superposition.

1.19. (a) Show that the following functions are solutions to the wave equation $u_{tt} = 4u_{xx}$: (i) $\cos(x - 2t)$, (ii) $e^{x + 2t}$; (iii) $x^2 + 2xt + 4t^2$.

(b) Write down at least four other solutions to the wave equation.

1.20. The displacement $u(t,x)$ of a forced violin string is modeled by the partial differential equation $u_{tt} = 4u_{xx} + F(t,x)$. When the string is subjected to the external forcing $F(t,x) = \cos x$, the solution is $u(t,x) = \cos(x - 2t) + \frac14 \cos x$, while when $F(t,x) = \sin x$, the solution is $u(t,x) = \sin(x - 2t) + \frac14 \sin x$. Find a solution when the forcing function $F(t,x)$ is (a) $\cos x - 5\sin x$, (b) $\sin(x - 3)$.


1.21. (a) Show that the partial derivatives $\partial_x[f] = \frac{\partial f}{\partial x}$ and $\partial_y[f] = \frac{\partial f}{\partial y}$ both define linear operators on the space of continuously differentiable functions $f(x,y)$. (b) For which values of $a, b, c, d$ is the differential operator $L[f] = a\,\frac{\partial f}{\partial x} + b\,\frac{\partial f}{\partial y} + c f + d$ linear?

1.22. (a) Prove that the Laplacian $\Delta = \partial_x^2 + \partial_y^2$ defines a linear differential operator. (b) Write out the Laplace equation $\Delta[u] = 0$ and the Poisson equation $-\Delta[u] = f$.

1.23. Prove that, on $\mathbb{R}^3$, the gradient, curl, and divergence all define linear operators.

1.24. Let $L$ and $M$ be linear partial differential operators. Prove that the following are also linear partial differential operators: (a) $L - M$, (b) $3L$, (c) $fL$, where $f$ is an arbitrary function of the independent variables; (d) $L \circ M$.

1.25. Suppose $L$ and $M$ are linear differential operators and let $N = L + M$. (a) Prove that $N$ is a linear operator. (b) True or false: If $u$ solves $L[u] = f$ and $v$ solves $M[v] = g$, then $w = u + v$ solves $N[w] = f + g$.

1.26.  Prove  Theorem  1.7. 


1.27.  Solve  the  following  inhomogeneous  linear  ordinary  differential  equations: 

(a)  u  —  4u  =  x  —  3,  (b)  5u"  —  4u  +  4u  =  ex  cos  ay  (c)  u"  —  3u=e^x. 

1.28.  Use  superposition  to  solve  the  following  inhomogeneous  ordinary  differential  equations: 
(a)  u!  +  2u  =  1  +  cosay  (b)  u"  —  9u  =  x  4~  sin  ay  (c)  9  un  —  l&u  +  10  a  =  1  +  ex  cosay 

(d)  u"  +  u  —  2u  =  sinhay  where  sinhx  =  \(ex  —  e~x),  (e)  u"  +  9 u  =  1  +  e3U 


Chapter  2 

Linear  and  Nonlinear  Waves 


Our  initial  foray  into  the  vast  mathematical  continent  that  comprises  partial  differential 
equations  will  begin  with  some  basic  first-order  equations.  In  applications,  first-order 
partial  differential  equations  are  most  commonly  used  to  describe  dynamical  processes, 
and  so  time,  t,  is  one  of  the  independent  variables.  Our  discussion  will  focus  on  dynamical 
models  in  a  single  space  dimension,  bearing  in  mind  that  most  of  the  methods  we  introduce 
can  be  extended  to  higher-dimensional  situations.  First-order  partial  differential  equations 
and  systems  model  a  wide  variety  of  wave  phenomena,  including  transport  of  pollutants  in 
fluids,  flood  waves,  acoustics,  gas  dynamics,  glacier  motion,  chromatography,  traffic  flow, 
and  various  biological  and  ecological  systems. 

A  basic  solution  technique  relies  on  an  inspired  change  of  variables,  which  comes 
from  rewriting  the  equation  in  a  moving  coordinate  frame.  This  naturally  leads  to  the 
fundamental  concept  of  characteristic  curve,  along  which  signals  and  physical  disturbances 
propagate.  The  resulting  method  of  characteristics  is  able  to  solve  a  first-order  linear 
partial  differential  equation  by  reducing  it  to  one  or  more  first-order  nonlinear  ordinary 
differential  equations. 

Proceeding to the nonlinear regime, the most important new phenomenon is the possible
breakdown of solutions in finite time, resulting in the formation of discontinuous
shock  waves.  A  familiar  example  is  the  supersonic  boom  produced  by  an  airplane  that 
breaks  the  sound  barrier.  Signals  continue  to  propagate  along  characteristic  curves,  but 
now  the  curves  may  cross  each  other,  precipitating  the  onset  of  a  shock  discontinuity.  The 
ensuing  shock  dynamics  is  not  uniquely  specified  by  the  partial  differential  equation,  but 
relies  on  additional  physical  properties,  to  be  specified  by  an  appropriate  conservation  law 
along  with  a  causality  condition.  A  full-fledged  analysis  of  shock  dynamics  becomes  quite 
challenging,  and  only  the  basics  will  be  developed  here. 

Having  attained  a  basic  understanding  of  first-order  wave  dynamics,  we  then  focus 
our  attention  on  the  first  of  three  paradigmatic  second-order  partial  differential  equations, 
known  as  the  wave  equation,  which  is  used  to  model  waves  and  vibrations  in  an  elastic 
bar,  a  violin  string,  or  a  column  of  air  in  a  wind  instrument.  Its  multi-dimensional  versions 
serve  to  model  vibrations  of  membranes,  solid  bodies,  water  waves,  electromagnetic  waves, 
including light, radio waves, microwaves, acoustic waves, and many other physical phenomena.
The one-dimensional wave equation is one of a small handful of physically relevant
partial differential equations that has an explicit solution formula, originally discovered by
the eighteenth-century French mathematician (and encyclopedist) Jean d’Alembert. His
solution is the result of being able to “factorize” the second-order wave equation into a
pair of first-order partial differential equations, of a type solved in the first part of this
chapter. We investigate the consequences of d’Alembert’s solution formula for the initial
value problem on the entire real line; solutions on bounded intervals will be deferred until
Chapter 4. Unfortunately, d’Alembert’s method is of rather limited scope, and does not
extend beyond the one-dimensional case, nor to equations modeling vibrations of nonuniform
media. The analysis of the wave equation in more than one space dimension can be
found  in  Chapters  11  and  12. 


2.1  Stationary  Waves 


When entering a new mathematical subject (in our case, partial differential equations),
one should first analyze and fully understand the very simplest examples. Indeed, mathematics
is, at its core, a bootstrapping enterprise, in which one builds on one’s knowledge
of and experience with elementary topics (in the present case, ordinary differential equations)
to make progress, first with the simpler types of partial differential equations, and
then, by developing and applying each newly gained insight and technique, to more and
more complicated situations.

The  simplest  partial  differential  equation,  for  a  function  u(t,x)  of  two  variables,  is 


$$\frac{\partial u}{\partial t} = 0. \tag{2.1}$$


It  is  a  first-order,  homogeneous,  linear  equation.  If  (2.1)  were  an  ordinary  differential 
equation† for a function u(t) of t alone, the solution would be obvious: u(t) = c must be
constant.  A  proof  of  this  basic  fact  proceeds  by  integrating  both  sides  with  respect  to  t 
and  then  appealing  to  the  Fundamental  Theorem  of  Calculus.  To  solve  (2.1)  as  a  partial 
differential  equation  for  u(t,  x),  let  us  similarly  integrate  both  sides  of  the  equation  from, 
say,  0  to  t,  producing 


$$0 = \int_0^t \frac{\partial u}{\partial s}(s, x)\, ds = u(t, x) - u(0, x).$$


Therefore,  the  solution  takes  the  form 


$$u(t, x) = f(x), \qquad \text{where} \qquad f(x) = u(0, x), \tag{2.2}$$

and hence is a function of the space variable x alone. The only requirement is that f(x)
be continuously differentiable, so $f \in C^1$, in order that u(t, x) be a bona fide classical
solution of the first-order partial differential equation (2.1). The solution (2.2) represents
a stationary wave, meaning that it does not change in time. The initial profile stays frozen
in place, and the system remains in equilibrium. Figure 2.1 plots a representative solution
as a function of x at three successive times.

† Of course, in this situation, we would write the equation as du/dt = 0.

Figure 2.2. Domain for stationary-wave solution.

The preceding analysis seems very straightforward and perhaps even a little boring.
But, to be completely rigorous, we need to take a bit more care. In our derivation, we
implicitly assumed that the solution u(t, x) was defined everywhere on $\mathbb{R}^2$. And, in fact,
the solution formula (2.2) is not completely valid as stated if the solution u(t, x) is defined
only on a subdomain $D \subset \mathbb{R}^2$.

Indeed, a solution u(t) to the corresponding ordinary differential equation du/dt = 0 is
constant, provided it is defined on a connected subinterval $I \subset \mathbb{R}$. A solution that is defined
on a disconnected subset $D \subset \mathbb{R}$ need only be constant on each connected subinterval
$I \subset D$. For instance, the nonconstant function

$$u(t) = \begin{cases} 1, & t > 0, \\ -1, & t < 0, \end{cases} \qquad \text{satisfies} \qquad \frac{du}{dt} = 0$$

everywhere on its domain of definition, that is, $D = \{ t \ne 0 \}$, but is constant only on the
connected positive and negative half-lines.

Similar counterexamples can be constructed in the case of the partial differential equation
(2.1). If the domain of definition is disconnected, then we do not expect u(t, x) to
depend only on x if we move from one connected component of D to another. Even that
is not the full story. For example, the function

$$u(t, x) = \begin{cases} 0, & x > 0, \\ x^2, & x \le 0, \ t > 0, \\ -x^2, & x \le 0, \ t < 0, \end{cases} \tag{2.3}$$

is continuously differentiable† on its domain of definition, namely $D = \mathbb{R}^2 \setminus \{ (0, x) \mid x \le 0 \}$,
satisfies $\partial u/\partial t = 0$ everywhere in D, but, nevertheless, is not a function of x alone, because,
for example, $u(1, x) = x^2 \ne u(-1, x) = -x^2$ for $x < 0$.

† You are asked to rigorously prove differentiability in Exercise 2.1.10.

A completely correct formulation can be stated as follows: If u(t, x) is a classical
solution to (2.1), defined on a domain $D \subset \mathbb{R}^2$ whose intersection with any horizontal† line,
namely $D_a = D \cap \{ (t, a) \mid t \in \mathbb{R} \}$, for each fixed $a \in \mathbb{R}$, is either empty or a connected
interval, then u(t, x) = f(x) is a function of x alone. An example of such a domain is
sketched in Figure 2.2. In Exercise 2.1.9, you are asked to justify these statements.

We  are  thus  slightly  chastened  in  our  dismissal  of  (2.1)  as  a  complete  triviality.  The 
lesson  is  that,  in  future,  one  must  always  be  careful  when  interpreting  such  “general” 
solution  formulas  —  since  they  often  rely  on  unstated  assumptions  on  their  underlying 
domain  of  definition. 
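
The counterexample (2.3) can be checked numerically. The following Python sketch is an illustration only, not part of the argument in the text. It assumes that the branches of (2.3) cover $x > 0$, and $x \le 0$ with either $t > 0$ or $t < 0$, so that the excluded set is the half-line $t = 0$, $x \le 0$. The names `u` and `du_dt`, the step size, and the sample points are arbitrary choices.

```python
def u(t, x):
    """Piecewise function (2.3); undefined on the slit t = 0, x <= 0."""
    if x > 0:
        return 0.0
    if t > 0:
        return x * x
    if t < 0:
        return -x * x
    raise ValueError("(t, x) lies on the excluded slit t = 0, x <= 0")


def du_dt(t, x, h=1e-6):
    """Central-difference estimate of the t-derivative (h must not cross t = 0)."""
    return (u(t + h, x) - u(t - h, x)) / (2 * h)


# Sample points in the domain D, chosen away from the slit.
points = [(1.0, -2.0), (-1.0, -2.0), (0.5, 3.0), (-0.5, 3.0), (2.0, 0.0)]
residuals = [du_dt(t, x) for t, x in points]
print(residuals)                      # every entry is 0.0 at these points
print(u(1.0, -2.0), u(-1.0, -2.0))    # 4.0 -4.0: same x, different values
```

The derivative in t vanishes at every sample point, yet the values at x = -2 differ between t = 1 and t = -1, so the function cannot depend on x alone.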


Exercises 


2.1.1. Solve the partial differential equation $\frac{\partial u}{\partial t} = x$ for u(t, x).

2.1.2. Solve the partial differential equation $\frac{\partial^2 u}{\partial t^2} = 0$ for u(t, x).

2.1.3.  Find  the  general  solution  u(t,x)  to  the  following  partial  differential  equations: 

(a) $u_x = 0$, (b) $u_t = 1$, (c) $u_t = x - t$, (d) $u_t + 3u = 0$, (e) $u_x + tu = 0$, (f) $u_{tt} + 4u = 1$.

2.1.4. Suppose u(t, x) is defined for all $(t, x) \in \mathbb{R}^2$ and solves $\partial u/\partial t + 2u = 0$. Prove that
$\lim_{t \to \infty} u(t, x) = 0$ for all x.


2.1.5. Write down the general solution to the partial differential equation $\partial u/\partial t = 0$ for a function
of three variables u(t, x, y). What assumptions should be made on the domain of definition
for your solution formula to be valid?


2.1.6. Solve the partial differential equation $\frac{\partial^2 u}{\partial x\, \partial y} = 0$ for u(x, y).

2.1.7. Answer Exercise 2.1.6 when u(x, y, z) depends on the three independent variables x, y, z.


♥ 2.1.8. Let u(t, x) solve the initial value problem $\frac{\partial u}{\partial t} + u^2 = 0$, $u(0, x) = f(x)$, where f(x) is a
bounded $C^1$ function of $x \in \mathbb{R}$. (a) Show that if $f(x) > 0$ for all x, then u(t, x) is defined
for all $t > 0$, and $\lim_{t \to \infty} u(t, x) = 0$. (b) On the other hand, if $f(x) < 0$, then the solution
u(t, x) is not defined for all $t > 0$, but in fact, $\lim_{t \to \tau^-} u(t, x) = -\infty$ for some $0 < \tau < \infty$.
Given x, what is the corresponding value of $\tau$? (c) Given f(x) as in part (b), what is the
longest time interval $0 < t < t_\star$ on which u(t, x) is defined for all $x \in \mathbb{R}$?

♦ 2.1.9. Justify the claim in the text that if u(t, x) is a solution of $\partial u/\partial t = 0$ that is defined on
a domain $D \subset \mathbb{R}^2$ with the property that $D_a = D \cap \{ (t, a) \mid t \in \mathbb{R} \}$ is either empty or a
connected interval, then u(t, x) = v(x) depends only on x for $(t, x) \in D$.

2.1.10.  Prove  that  the  function  in  (2.3)  is  continuously  differentiable  at  all  points  (t,x)  in  its 
domain  of  definition. 


† Important: We will adopt the (slightly unusual) convention of displaying the (t, x)-plane
with time t along the horizontal axis and space x along the vertical axis, which also conforms
with our convention of writing t before x in expressions like u(t, x). Later developments will amply
vindicate our adoption of this convention.


2.2 Transport and Traveling Waves

In many respects, the stationary-wave equation (2.1) does not quite qualify as a partial
differential equation. Indeed, the spatial variable x enters only parametrically in the solution
to what is, in essence (ignoring technical difficulties with domains), a very simple
ordinary differential equation.

Let  us  then  turn  to  a  more  “genuine”  example.  Consider  the  linear,  homogeneous 
first-order  partial  differential  equation 


$$\frac{\partial u}{\partial t} + c\, \frac{\partial u}{\partial x} = 0, \tag{2.4}$$

for a function u(t, x), in which c is a fixed, nonzero constant, known as the wave speed for
reasons that will soon become apparent. We will refer to (2.4) as the transport equation,
because it models the transport of a substance, e.g., a pollutant, in a uniform fluid flow that
is moving with velocity c. In this model, the solution u(t, x) represents the concentration of
the pollutant at time t and spatial position x. Other common names for (2.4) are the first-order
or unidirectional wave equation. But for brevity, as well as to avoid any confusion
with the second-order, bidirectional wave equation discussed extensively later on, we will
stick with the designation “transport equation” here. Solving the transport equation is
slightly more challenging, but, as we will see, not difficult.

Since the transport equation involves time, its solutions are distinguished by their
initial values. Because the equation is first-order, we need only specify the value of the
solution at an initial time $t_0$, leading to the initial value problem

$$u(t_0, x) = f(x) \qquad \text{for all} \quad x \in \mathbb{R}. \tag{2.5}$$

As we will show, as long as $f \in C^1$, i.e., is continuously differentiable, the initial conditions
serve to specify a unique classical solution. Also, by replacing the time variable t by $t - t_0$,
we can, without loss of generality, set $t_0 = 0$.


Uniform  Transport 

Let  us  begin  by  assuming  that  the  wave  speed  c  is  constant.  In  general,  when  one  is 
confronted  with  a  new  equation,  one  solution  strategy  is  to  try  to  convert  it  into  an  equation 
that  you  already  know  how  to  solve.  In  this  case,  we  will  introduce  a  simple  change  of 
variables  that  effectively  rewrites  the  equation  in  a  moving  coordinate  system,  inspired  by 
the  interpretation  of  c  as  the  overall  transport  speed. 

If x represents the position of an object in a fixed coordinate frame, then

$$\xi = x - ct \tag{2.6}$$

represents the object’s position relative to an observer who is uniformly moving with velocity
c. Think of a passenger in a moving train to whom stationary objects appear to
be moving backwards at the train’s speed c. To formulate a physical process in the reference
frame of the passenger, we replace the stationary space-time coordinates (t, x) by the
moving coordinates $(t, \xi)$.

Remark: These are the same changes of reference frame that underlie Einstein’s special
theory of relativity. However, unlike Einstein, we are working in a purely classical,
nonrelativistic universe here. Such changes to moving coordinates are, in fact, of a much
older vintage, and named Galilean boosts in honor of Galileo Galilei, who was the first to
champion such “relativistic” moving coordinate systems.

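Let us see what happens when we re-express the transport equation in terms of the
moving coordinate frame. We rewrite

$$u(t, x) = v(t, x - ct) = v(t, \xi) \tag{2.7}$$

in terms of the characteristic variable $\xi = x - ct$, along with the time t. To write out
the differential equation satisfied by $v(t, \xi)$, we apply the chain rule from multivariable
calculus, [8, 108], to express the derivatives of u in terms of those of v:

$$\frac{\partial u}{\partial t} = \frac{\partial v}{\partial t} - c\, \frac{\partial v}{\partial \xi}, \qquad \frac{\partial u}{\partial x} = \frac{\partial v}{\partial \xi}. \tag{2.8}$$

Therefore,

$$\frac{\partial u}{\partial t} + c\, \frac{\partial u}{\partial x} = \frac{\partial v}{\partial t} - c\, \frac{\partial v}{\partial \xi} + c\, \frac{\partial v}{\partial \xi} = \frac{\partial v}{\partial t}.$$

We deduce that u(t, x) solves the transport equation (2.4) if and only if $v(t, \xi)$ solves the
stationary-wave equation

$$\frac{\partial v}{\partial t} = 0. \tag{2.9}$$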

Thus,  the  effect  of  using  a  moving  coordinate  system  is  to  convert  a  wave  moving  with 
velocity  c  into  a  stationary  wave.  Think  again  of  the  passenger  in  the  train  —  a  second 
train  moving  at  the  same  speed  appears  as  if  it  were  stationary. 

According to our earlier discussion, the solution $v = v(\xi)$ to the stationary-wave
equation (2.9) is a function of the characteristic variable alone. (For simplicity, we assume
that $v(t, \xi)$ has an appropriate domain of definition, e.g., it is defined everywhere on $\mathbb{R}^2$.)
Recalling (2.7), we conclude that the solution

$$u = v(\xi) = v(x - ct)$$

to the transport equation must be a function of the characteristic variable only. We have
therefore proved the following result:

Proposition 2.1. If u(t, x) is a solution to the partial differential equation

$$u_t + c\, u_x = 0, \tag{2.10}$$

which is defined on all of $\mathbb{R}^2$, then

$$u(t, x) = v(x - ct), \tag{2.11}$$

where $v(\xi)$ is a $C^1$ function of the characteristic variable $\xi = x - ct$.

Figure 2.4. Characteristic line.

In other words, any (reasonable) function of the characteristic variable, e.g., $\xi^2 + 1$, or
$\cos \xi$, or $e^{\xi}$, will produce a corresponding solution, $(x - ct)^2 + 1$, or $\cos(x - ct)$, or $e^{x - ct}$, to
the transport equation with constant wave speed c. And, in accordance with the counting
principle of Chapter 1, the general solution to this first-order partial differential equation
in two independent variables depends on one arbitrary function of a single variable.

To a stationary observer, the solution (2.11) appears as a traveling wave of unchanging
form moving at constant velocity c. When c > 0, the wave translates to the right, as illustrated
in Figure 2.3. When c < 0, the wave translates to the left, while c = 0 corresponds
to a stationary wave form that remains fixed at its original location, as in Figure 2.1.

At t = 0, the wave has the initial profile

$$u(0, x) = v(x), \tag{2.12}$$

and so (2.11) provides the (unique) solution to the initial value problem (2.4), (2.12). For
example, the solution to the particular initial value problem

$$u_t + 2u_x = 0, \qquad u(0, x) = \frac{1}{1 + x^2}, \qquad \text{is} \qquad u(t, x) = \frac{1}{1 + (x - 2t)^2}.$$
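
As a sanity check on this example, the following Python sketch estimates the residual $u_t + 2u_x$ by central differences at a few sample points. The step size `h`, the sample points, and the function names are assumptions made for illustration. A small residual is consistent with the solution formula but does not prove it.

```python
def f(x):
    """Initial profile u(0, x) = 1 / (1 + x^2)."""
    return 1.0 / (1.0 + x * x)


C = 2.0  # wave speed in u_t + 2 u_x = 0


def u(t, x):
    """Traveling-wave solution (2.11): the initial profile shifted by C * t."""
    return f(x - C * t)


def residual(t, x, h=1e-5):
    """Central-difference estimate of u_t + C * u_x at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2.0 * h)
    return u_t + C * u_x


samples = [(0.0, 0.0), (0.5, 1.0), (1.0, -3.0), (2.0, 4.5)]
max_residual = max(abs(residual(t, x)) for t, x in samples)
print(max_residual)  # small: only discretization and rounding error remain
```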


Since it depends only on the characteristic variable $\xi = x - ct$, every solution to the
transport equation is constant on the characteristic lines of slope† c, namely

$$x = ct + k, \tag{2.13}$$

where k is an arbitrary constant. At any given time t, the value of the solution at position
x depends only on its original value on the characteristic line passing through (t, x).
This is indicative of a general fact concerning such wave models: Signals propagate along
characteristics. Indeed, a disturbance at an initial point (0, y) only affects the value of the
solution at points (t, x) that lie on the characteristic line x = ct + y emanating therefrom,
as illustrated in Figure 2.4.

† This makes use of our convention that the t-axis is horizontal and the x-axis is vertical.
Reversing the axes will replace the slope by its reciprocal.
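
The statement that signals propagate along characteristics can be illustrated with a small experiment. The Python sketch below compares two solutions of the transport equation whose initial data differ only by a smooth bump supported on $-1 < y < 1$. The wave speed, the base profile, and the bump are all hypothetical choices made for this illustration.

```python
import math

C = 2.0  # wave speed (illustrative value)


def bump(y):
    """Smooth disturbance supported on -1 < y < 1."""
    return math.exp(-1.0 / (1.0 - y * y)) if abs(y) < 1.0 else 0.0


def base(y):
    """Unperturbed initial profile (an arbitrary smooth choice)."""
    return math.cos(y)


def perturbed(y):
    """Initial profile with an extra disturbance near y = 0."""
    return base(y) + bump(y)


def solve(initial, t, x):
    """u(t, x) = initial(x - C t): follow the characteristic back to t = 0."""
    return initial(x - C * t)


def difference(t, x):
    """How much the disturbance changes the solution at (t, x)."""
    return solve(perturbed, t, x) - solve(base, t, x)


t = 3.0
print(difference(t, C * t))  # on the characteristic through (0, 0): nonzero
print(difference(t, 0.0))    # foot point x - C t = -6 lies outside the bump: 0.0
```

The disturbance is felt only at points whose characteristic line meets the initial axis inside the support of the bump.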


Transport  with  Decay 


Let a > 0 be a positive constant, and c an arbitrary constant. The homogeneous linear
first-order partial differential equation

$$\frac{\partial u}{\partial t} + c\, \frac{\partial u}{\partial x} + a\, u = 0 \tag{2.14}$$


models  the  transport  of,  say,  a  radioactively  decaying  solute  in  a  uniform  fluid  flow  with 
wave  speed  c.  The  coefficient  a  governs  the  rate  of  decay.  We  can  solve  this  variant  of  the 
transport  equation  by  the  self-same  change  of  variables  to  a  uniformly  moving  coordinate 
system. 

Rewriting u(t, x) in terms of the characteristic variable, as in (2.7), and then recalling
our chain rule calculation (2.8), we find that $v(t, \xi) = u(t, \xi + ct)$ satisfies the partial
differential equation

$$\frac{\partial v}{\partial t} + a\, v = 0.$$


The result is, effectively, a homogeneous linear first-order ordinary differential equation,
in which the characteristic variable $\xi$ enters only parametrically. The standard solution
technique learned in elementary ordinary differential equations, [20, 23], tells us to multiply
the equation by the exponential integrating factor $e^{at}$, leading to

$$e^{at} \left( \frac{\partial v}{\partial t} + a\, v \right) = \frac{\partial}{\partial t} \bigl( e^{at} v \bigr) = 0.$$

We conclude that $w = e^{at} v$ solves the stationary-wave equation (2.1). Thus,

$$w = e^{at} v = f(\xi), \qquad \text{and hence} \qquad v(t, \xi) = f(\xi)\, e^{-at},$$

where $f(\xi)$ is an arbitrary function of the characteristic variable. Reverting to physical
coordinates, we produce the solution formula

$$u(t, x) = f(x - ct)\, e^{-at}, \tag{2.15}$$

which solves the initial value problem u(0, x) = f(x). It represents a wave that is moving
along with fixed velocity c while simultaneously decaying at an exponential rate as prescribed
by the coefficient a > 0. A typical solution, for c > 0, is plotted at three successive
times in Figure 2.5. While the solution (2.15) is no longer constant on the characteristics,
signals continue to propagate along them, since a solution’s initial value at a point
(0, y) will only affect its subsequent (decaying) values on the associated characteristic line
x = ct + y.
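
The solution formula (2.15) can be checked in the same numerical spirit. The Python sketch below uses illustrative values of a and c and an arbitrary Gaussian initial profile, none of which come from the text. It estimates the residual of (2.14) by central differences and confirms that, along a characteristic line, the solution is the initial value damped by $e^{-at}$.

```python
import math

A = 0.5  # decay rate a > 0 (illustrative value)
C = 1.0  # wave speed c (illustrative value)


def f(x):
    """Initial profile (an arbitrary smooth choice)."""
    return math.exp(-x * x)


def u(t, x):
    """Solution formula (2.15): translate by C * t and damp by exp(-A * t)."""
    return f(x - C * t) * math.exp(-A * t)


def residual(t, x, h=1e-5):
    """Central-difference estimate of u_t + C u_x + A u at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2.0 * h)
    return u_t + C * u_x + A * u(t, x)


samples = [(0.0, 0.0), (0.5, 1.0), (1.0, -0.5), (2.0, 2.3)]
max_residual = max(abs(residual(t, x)) for t, x in samples)

# Along the characteristic x = C t + y, the value is f(y) damped by exp(-A t).
y, t = 0.3, 2.0
ratio = u(t, C * t + y) / f(y)
print(max_residual, ratio, math.exp(-A * t))
```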


Exercises 

2.2.1. Find the solution to the initial value problem $u_t + u_x = 0$, $u(1, x) = x/(1 + x^2)$.

2.2.2. Solve the following initial value problems and graph the solutions at times t = 1, 2, and 3:
(a) $u_t - 3u_x = 0$, $u(0, x) = e^{-x^2}$; (b) $u_t + 2u_x = 0$, $u(-1, x) = x/(1 + x^2)$;
(c) $u_t + u_x + \tfrac12 u = 0$, $u(0, x) = \tan^{-1} x$; (d) $u_t - 4u_x + u = 0$, $u(0, x) = 1/(1 + x^2)$.

2.2.3.  Graph  some  of  the  characteristic  lines  for  the  following  equations,  and  write  down  a 
formula  for  the  general  solution: 

(a) $u_t - 3u_x = 0$, (b) $u_t + 5u_x = 0$, (c) $u_t + u_x + 3u = 0$, (d) $u_t - 4u_x + u = 0$.

2.2.4. Solve the initial value problem $u_t + 2u_x = 1$, $u(0, x) = e^{-x^2}$.
Hint: Use characteristic coordinates.

2.2.5. Answer Exercise 2.2.4 for the initial value problem $u_t + 2u_x = \sin x$, $u(0, x) = \sin x$.

♦ 2.2.6. Let c be constant. Suppose that u(t, x) solves the initial value problem $u_t + cu_x = 0$,
$u(0, x) = f(x)$. Prove that $v(t, x) = u(t - t_0, x)$ solves the initial value problem $v_t + cv_x = 0$,
$v(t_0, x) = f(x)$.

2.2.7.  Is  Exercise  2.2.6  valid  when  the  transport  equation  is  replaced  by  the  damped  transport 
equation  (2.14)? 

2.2.8. Let $c \ne 0$. Prove that if the initial data satisfies $u(0, x) = v(x) \to 0$ as $x \to \pm\infty$, then,
for each fixed x, the solution to the transport equation (2.4) satisfies $u(t, x) \to 0$ as $t \to \infty$.

2.2.9. (a) Prove that if the initial data is bounded, $|f(x)| < M$ for all $x \in \mathbb{R}$, then the solution
to the damped transport equation (2.14) with a > 0 satisfies $u(t, x) \to 0$ as $t \to \infty$.
(b) Find a solution to (2.14) that is defined for all (t, x) but does not satisfy $u(t, x) \to 0$
as $t \to \infty$.

2.2.10. Let F(t, x) be a $C^1$ function of $(t, x) \in \mathbb{R}^2$. (a) Write down a formula for the general
solution u(t, x) to the inhomogeneous partial differential equation $u_t = F(t, x)$.
(b) Solve the inhomogeneous transport equation $u_t + cu_x = F(t, x)$.

♥ 2.2.11. (a) Write down a formula for the general solution to the nonlinear partial differential
equation $u_t + u_x + u^2 = 0$. (b) Show that if the initial data is positive and bounded,
$0 < u(0, x) = f(x) < M$, then the solution exists for all t > 0, and $u(t, x) \to 0$ as $t \to \infty$.
(c) On the other hand, if the initial data is negative somewhere, so f(x) < 0 at some $x \in \mathbb{R}$,
then the solution blows up in finite time: $\lim_{t \to \tau^-} u(t, y) = -\infty$ for some $\tau > 0$ and some
$y \in \mathbb{R}$. (d) Find a formula for the earliest blow-up time $\tau_\star > 0$.

2.2.12. A sensor situated at position x = 1 monitors the concentration of a pollutant u(t, 1) as
a function of t for t > 0. Assuming that the pollutant is transported with wave speed c = 3,
at what locations x can you determine the initial concentration u(0, x)?

2.2.13. Write down a solution to the transport equation $u_t + 2u_x = 0$ that is defined on a
connected domain $D \subset \mathbb{R}^2$ and that is not a function of the characteristic variable alone.

2.2.14. Let c > 0. Consider the uniform transport equation $u_t + cu_x = 0$ restricted to the
quarter-plane $Q = \{ x > 0, \ t > 0 \}$ and subject to initial conditions u(0, x) = f(x) for x > 0,
along with boundary conditions u(t, 0) = g(t) for t > 0. (a) For which initial and boundary
conditions does a classical solution to this initial-boundary value problem exist? Write
down a formula for the solution. (b) On which regions are the effects of the initial conditions
felt? What about the boundary conditions? Is there any interaction between the two?

2.2.15.  Answer  Exercise  2.2.14  when  c  <  0. 


Nonuniform  Transport 


Slightly more complicated, but still linear, is the nonuniform transport equation

$$\frac{\partial u}{\partial t} + c(x)\, \frac{\partial u}{\partial x} = 0, \tag{2.16}$$

where the wave speed c(x) is now allowed to depend on the spatial position. Characteristics
continue to guide the behavior of solutions, but when the wave speed is not constant, we
can no longer expect them to be straight lines. To adapt the method of characteristics,
let us look at how the solution varies along a prescribed curve in the (t, x)-plane. Assume
that the curve is identified with the graph of a function x = x(t), and let

$$h(t) = u(t, x(t))$$

be the value of the solution on it. We compute the rate of change in the solution along
the curve by differentiating h with respect to t. Invoking the multivariable chain rule, we
obtain

$$\frac{dh}{dt} = \frac{d}{dt}\, u(t, x(t)) = \frac{\partial u}{\partial t}(t, x(t)) + \frac{dx}{dt}\, \frac{\partial u}{\partial x}(t, x(t)). \tag{2.17}$$

In particular, if x(t) satisfies

$$\frac{dx}{dt} = c(x(t)), \qquad \text{then} \qquad \frac{dh}{dt} = \frac{\partial u}{\partial t}(t, x(t)) + c(x(t))\, \frac{\partial u}{\partial x}(t, x(t)) = 0,$$

since we are assuming that u(t, x) solves the transport equation (2.16) for all values of
(t, x), including those points (t, x(t)) on the curve. Since its derivative is zero, h(t) must
be a constant, which motivates the following definition.


Definition 2.2. The graph of a solution x(t) to the autonomous ordinary differential
equation

$$\frac{dx}{dt} = c(x) \tag{2.18}$$

is called a characteristic curve for the transport equation with wave speed c(x).

In other words, at each point (t, x), the slope of the characteristic curve equals the
wave speed c(x) there. In particular, if c is constant, the characteristic curves are straight
lines of slope c, in accordance with our earlier construction.


Proposition 2.3. Solutions to the linear transport equation (2.16) are constant
along characteristic curves.

Figure 2.6. Characteristic curve.


The characteristic curve equation (2.18) is an autonomous first-order ordinary differential
equation. As such, it can be immediately solved by separating variables, [20, 23].
Assuming $c(x) \ne 0$, we divide both sides of the equation by c(x), and then integrate the
resulting equation:

$$\frac{dx}{c(x)} = dt, \qquad \text{whereby} \qquad \beta(x) := \int \frac{dx}{c(x)} = t + k, \tag{2.19}$$

with k denoting the integration constant. For each fixed value of k, (2.19) serves to implicitly
define a characteristic curve, namely,

$$x(t) = \beta^{-1}(t + k),$$

with $\beta^{-1}$ denoting the inverse function. On the other hand, if $c(x_\star) = 0$, then $x_\star$ is a
fixed point for the ordinary differential equation (2.18), and the horizontal line $x = x_\star$ is a
stationary characteristic curve.
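
When $\beta$ cannot be inverted in closed form, a characteristic curve can instead be traced by integrating (2.18) numerically. The Python sketch below does this with the classical fourth-order Runge-Kutta method, a standard ODE scheme that the text itself does not use. The wave speed $c(x) = 1/(x^2 + 1)$, for which $\beta(x) = \tfrac13 x^3 + x$, is an illustrative choice, as are the starting point, final time, and step count. The check is that $\beta(x(t)) - t$ keeps its initial value, as (2.19) requires.

```python
def c(x):
    """Wave speed (illustrative choice)."""
    return 1.0 / (x * x + 1.0)


def beta(x):
    """Antiderivative of 1 / c(x), with integration constant taken as 0."""
    return x ** 3 / 3.0 + x


def trace_characteristic(x0, t_end, steps=1000):
    """Integrate dx/dt = c(x) from (0, x0) up to time t_end with classical RK4."""
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = c(x)
        k2 = c(x + 0.5 * dt * k1)
        k3 = c(x + 0.5 * dt * k2)
        k4 = c(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


x0, t_end = -3.0, 10.0
x_end = trace_characteristic(x0, t_end)

# beta(x(t)) - t should equal the constant k = beta(x0) along the curve.
drift = (beta(x_end) - t_end) - beta(x0)
print(x_end, drift)  # drift is close to zero, up to integration error
```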

Since the solution u(t, x) is constant along the characteristic curves, it must therefore
be a function of the characteristic variable

$$\xi = \beta(x) - t \tag{2.20}$$

alone, and hence of the form

$$u(t, x) = v(\beta(x) - t), \tag{2.21}$$

where $v(\xi)$ is an arbitrary $C^1$ function. Indeed, it is easy to check directly that, provided
$\beta(x)$ is defined by (2.19), u(t, x) solves the partial differential equation (2.16) for any choice
of $C^1$ function $v(\xi)$. (But keep in mind that the algebraic solution formula (2.21) may fail
to be valid at points where the wave speed vanishes: $c(x_\star) = 0$.)

Warning: The definition of characteristic variable used here is slightly different from
that in the constant wave speed case, which, by (2.20), would be $\xi = x/c - t = (x - ct)/c$.
Clearly, rescaling the characteristic variable by 1/c is an inessential modification of our
original definition.

Figure 2.7. Characteristic curves for $u_t + (x^2 + 1)^{-1} u_x = 0$.


To find the solution that satisfies the prescribed initial conditions

$$u(0, x) = f(x), \tag{2.22}$$

we merely substitute the general solution formula (2.21). This leads to the implicit equation
$v(\beta(x)) = f(x)$ for the function $v(\xi) = f \circ \beta^{-1}(\xi)$. The resulting solution formula

$$u(t, x) = f \circ \beta^{-1}\bigl( \beta(x) - t \bigr) \tag{2.23}$$

is not particularly enlightening, but it does have a simple graphical interpretation: To find
the value of the solution u(t, x), we look at the characteristic curve passing through the
point (t, x). If this curve intersects the x-axis at the point (0, y), as in Figure 2.6, then
u(t, x) = u(0, y) = f(y), since the solution must be constant along the curve. On the other
hand, if the characteristic curve through (t, x) doesn’t intersect the x-axis, the solution
value u(t, x) is not prescribed by the initial data.
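
Formula (2.23) translates directly into code once $\beta$ and its inverse are available. The Python sketch below is a minimal illustration with hypothetical names (`make_solution`, `beta`, `beta_inverse`). It is tested on a constant wave speed, where $\beta(x) = x/c$ and the result should agree with the traveling wave $f(x - ct)$ of (2.11), in line with the rescaling noted in the Warning above.

```python
def make_solution(f, beta, beta_inverse):
    """Build u(t, x) = f(beta_inverse(beta(x) - t)), as in formula (2.23)."""
    def u(t, x):
        return f(beta_inverse(beta(x) - t))
    return u


C = 2.0  # constant wave speed, chosen so the result can be compared with (2.11)


def beta(x):
    """beta(x) = integral of dx / C = x / C (integration constant taken as 0)."""
    return x / C


def beta_inverse(s):
    """Inverse of beta for constant wave speed."""
    return C * s


def f(x):
    """Initial profile (same as the uniform transport example)."""
    return 1.0 / (1.0 + x * x)


u = make_solution(f, beta, beta_inverse)
print(u(1.5, 4.0), f(4.0 - C * 1.5))  # both print 0.5
```

Passing `beta` and `beta_inverse` as arguments keeps the formula separate from the question of how the inverse is computed, which may need a numerical root finder when no closed form exists.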

Example  2.4.  Let  ns  solve  the  nonnniform  transport  equation 

du  1  du 
dt  x2  +  1  dx 

by  the  method  of  characteristics.  According  to  (2.18),  the  characteristic  curves  are  the 
graphs  of  solutions  to  the  first-order  ordinary  differential  equation 

dx  1 
dt  x2  +  1 

Separating  variables  and  integrating,  we  obtain 

P(x)  =  J  (x2  +  1)  dx  —  |  x3  +  x  =  t  +  fc,  (2.25) 

where  k  is  the  integration  constant.  Representative  curves  are  plotted  in  Figure  2.7.  (In  this 
case,  inverting  the  function  /?,  i.e.,  solving  (2.25)  for  x  as  a  function  of  t,  is  not  particularly 
enlightening.) 
According to (2.20), the characteristic variable is $\xi = \tfrac{1}{3}x^3 + x - t$, and hence the
general solution to the equation takes the form

$$u = v\bigl(\tfrac{1}{3}x^3 + x - t\bigr), \tag{2.26}$$

where $v(\xi)$ is an arbitrary $C^1$ function. A typical solution, corresponding to initial data

$$u(0, x) = \frac{1}{1 + (x + 3)^2}, \tag{2.27}$$

is plotted† at the indicated times in Figure 2.8. Although the solution remains constant
along each individual characteristic curve, a stationary observer will witness a dynamically
changing profile as the wave moves through the nonuniform medium. In this example, since
c(x) > 0 everywhere, the wave always moves from left to right; its speed as it passes through
a point x is determined by the magnitude of $c(x) = (x^2 + 1)^{-1}$, with the consequence that each part
accelerates as it approaches the origin from the left, and then slows back down once it
passes by and c(x) decreases in magnitude. To a stationary observer, the wave spreads out
as it speeds through the origin, and then becomes progressively narrower and slower as it
gradually moves off to $+\infty$.
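The sampling procedure used to produce such plots can be sketched in a few lines of Python. This is only an illustration, not the code behind the figures: the helper names (`beta_inverse`, `sample_solution`), the sample grid, and the use of Newton's method to invert $\beta$ are assumptions made here. Each sample point (0, y) is carried along its characteristic $\beta(x) = t + \beta(y)$, and the initial value is carried with it.

```python
def beta(x):
    """beta(x) = x^3/3 + x from (2.25); strictly increasing since beta'(x) = x^2 + 1 > 0."""
    return x**3 / 3.0 + x

def beta_inverse(s, tol=1e-14, max_iter=200):
    """Solve beta(x) = s for x by Newton's method (an illustrative choice of root finder)."""
    x = s  # from this starting guess the iterates approach the root from one side
    for _ in range(max_iter):
        step = (beta(x) - s) / (x * x + 1.0)
        x -= step
        if abs(step) <= tol * (1.0 + abs(x)):
            break
    return x

def initial_data(y):
    """Initial profile (2.27)."""
    return 1.0 / (1.0 + (y + 3.0) ** 2)

def sample_solution(ys, t):
    """Carry each sample point (0, y) along its characteristic to time t.

    Returns a list of pairs (x, u) with beta(x) = beta(y) + t and u = u(0, y).
    """
    return [(beta_inverse(beta(y) + t), initial_data(y)) for y in ys]

# Hypothetical uniformly spaced sample points y_1 < y_2 < ... < y_n.
ys = [-8.0 + 0.25 * i for i in range(49)]
samples = sample_solution(ys, 2.0)
for x, u in samples[::8]:
    print(f"x = {x:9.5f}   u = {u:.6f}")
```

The resulting x values are no longer uniformly spaced, which is why an interpolation step is needed before plotting a smooth profile.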

Example  2.5.  Consider  the  nonuniform  transport  equation 

$$u_t + (x^2 - 1)\,u_x = 0. \tag{2.28}$$


† The required function v in (2.26) is implicitly given by the equation $v\bigl(\tfrac{1}{3}x^3 + x\bigr) = u(0, x)$,
and so the explicit formula for u(t, x) is not very instructive or useful. Indeed, to make the plots,
we instead sampled the initial data (2.27) at a collection of uniformly spaced points $y_1 < y_2 < \cdots < y_n$.
Since the solution is constant along the characteristic curve (2.25) passing through each
sample point $(0, y_i)$, we can find nonuniformly spaced sample values for $u(t, x_i)$ at any later time.
The smooth solution curve u(t, x) is then approximated using spline interpolation, [89; §11.4], on
these sample values.

Figure 2.9. Characteristic curves for $u_t + (x^2 - 1)\,u_x = 0$.


In this case, the characteristic curves are the solutions to

$$\frac{dx}{dt} = x^2 - 1,$$

and so

$$\beta(x) = \int \frac{dx}{x^2 - 1} = \frac{1}{2}\log\left|\frac{x - 1}{x + 1}\right| = t + k. \tag{2.29}$$

One must also include the horizontal lines $x = x_\pm = \pm 1$ corresponding to the roots of
$c(x) = x^2 - 1$. The curves are graphed in Figure 2.9. Note that those curves starting below
$x_+ = 1$ converge to $x_- = -1$ as $t \to \infty$, while those starting above $x_+ = 1$ veer off to $\infty$
in finite time. Owing to the sign of $c(x) = x^2 - 1$, points on the graph of u(0, x) lying over
$|x| < 1$ will move to the left, while those over $|x| > 1$ will move to the right.

In Figure 2.10, we graph several snapshots of the solution whose initial value is a
bell-shaped Gaussian profile

$$u(0, x) = e^{-x^2}.$$


The initial conditions uniquely prescribe the value of the solution along the characteristic
curves that intersect the x-axis. On the other hand, if

$$x \leq \frac{1 + e^{2t}}{1 - e^{2t}} \qquad \text{for} \quad t > 0,$$

the characteristic curve through (t, x) does not intersect the x-axis, and hence the value
of the solution at such points, lying in the shaded region in Figure 2.9, is not prescribed
by the initial data. Let us arbitrarily assign the solution to be u(t, x) = 0 at such points.
At other values of (t, x) with $t \geq 0$, the solution (2.23) is

$$u(t, x) = \exp\left[-\left(\frac{x + 1 + (x - 1)\,e^{-2t}}{x + 1 - (x - 1)\,e^{-2t}}\right)^{2}\right]. \tag{2.30}$$
Figure 2.10. Solution to $u_t + (x^2 - 1)\,u_x = 0$.


(The derivation of this solution formula is left as Exercise 2.2.23.) As t increases, the
solution's peak becomes more and more concentrated near $x_- = -1$, while the section of
the wave above $x > x_+ = 1$ rapidly spreads out to $\infty$. In the long term, the solution
converges (albeit nonuniformly) to a step function of height 1/e:

$$u(t, x) \longrightarrow s(x) = \begin{cases} 1/e \approx .367879, & x \geq -1, \\ 0, & x < -1, \end{cases} \qquad \text{as} \quad t \to \infty.$$
Let us finish by making a few general observations concerning the characteristic curves
of transport equations whose wave speed c(x) depends only on the position x. Using the
basic existence and uniqueness theory for such autonomous ordinary differential equations,
[20, 23, 52], and assuming that c(x) is continuously differentiable:†


• There is a unique characteristic curve passing through each point $(t, x) \in \mathbb{R}^2$.

• Characteristic curves cannot cross each other.

• If $t = \beta(x)$ is a characteristic curve, then so are all its horizontal translates:
$t = \beta(x) + k$ for any k.

• Each non-horizontal characteristic curve is the graph of a strictly monotone function.
Thus, each point on a wave always moves in the same direction, and can never
reverse its direction of propagation.

• As t increases, the characteristic curve either tends to a fixed point, $x(t) \to x_\star$ as
$t \to \infty$, with $c(x_\star) = 0$, or goes off to $\pm\infty$ in either finite or infinite time.


Proofs of these statements are assigned to the reader in Exercise 2.2.25.


† For those who know about such things, [18, 52], this assumption can be weakened to just
Lipschitz continuity.
Exercises


2.2.16.  (a)  Find  the  general  solution  to  the  first-order  equation  ut  +  %ux  =  0. 

(b) Find a solution satisfying the initial condition u(1, x) = sin x. Is your solution unique?

2.2.17. (a) Solve the initial value problem $u_t - x\,u_x = 0$, $u(0, x) = (x^2 + 1)^{-1}$.

(b) Graph the solution at times t = 0, 1, 2, 3. (c) What is $\lim_{t \to \infty} u(t, x)$?

2.2.18. Suppose the initial data u(0, x) = f(x) of the nonuniform transport equation (2.28) is
continuous and satisfies $f(x) \to 0$ as $|x| \to \infty$. What is the limiting solution profile u(t, x)
as (a) $t \to \infty$? (b) $t \to -\infty$?

C 2.2.19. (a) Find and graph the characteristic curves for the equation $u_t + (\sin x)\,u_x = 0$.

(b) Write down the solution with initial data $u(0, x) = \cos \tfrac{1}{2}\pi x$. (c) Graph your solution
at times t = 0, 1, 2, 3, 5, and 10. (d) What is the limiting solution profile as $t \to \infty$?

2.2.20. Consider the linear transport equation $u_t + (1 + x^2)\,u_x = 0$. (a) Find and sketch the
characteristic curves. (b) Write down a formula for the general solution. (c) Find the
solution to the initial value problem u(0, x) = f(x) and discuss its behavior as t increases.

2.2.21. Prove that, for $t \gg 0$, the speed of the wave in Example 2.4 is asymptotically
proportional to $t^{-2/3}$.


2.2.22.  Verify  directly  that  formula  (2.21)  defines  a  solution  to  the  differential  equation  (2.16). 


0  2.2.23.  Explain  how  to  derive  the  solution  formula  (2.30).  Justify  that  it  defines  a  solution  to 
equation  (2.28). 

2.2.24. Let c(x) be a bounded $C^1$ function, so $|c(x)| \leq c_\star < \infty$ for all x. Let f(x) be any $C^1$
function. Prove that the solution u(t, x) to the initial value problem $u_t + c(x)\,u_x = 0$,
u(0, x) = f(x), is uniquely defined for all $(t, x) \in \mathbb{R}^2$.


G 2.2.25. Suppose that $c(x) \in C^1$ is continuously differentiable for all $x \in \mathbb{R}$. (a) Prove that the
characteristic curves of the transport equation (2.16) cannot cross each other. (b) A point
where $c(x_\star) = 0$ is known as a fixed point for the characteristic equation dx/dt = c(x).
Explain why the characteristic curve passing through a fixed point $(t, x_\star)$ is a horizontal
straight line. (c) Prove that if x = g(t) is a characteristic curve, then so are all the horizontally
translated curves $x = g(t + \delta)$ for any $\delta$. (d) True or false: Every characteristic curve
has the form $x = g(t + \delta)$, for some fixed function g(t). (e) Prove that each non-horizontal
characteristic curve is the graph x = g(t) of a strictly monotone function. (f) Explain why
a wave cannot reverse its direction. (g) Show that a non-horizontal characteristic curve
starts, in the distant past, $t \to -\infty$, at either a fixed point or at $-\infty$ and ends, as
$t \to +\infty$, at either the next-larger fixed point or at $+\infty$.

G 2.2.26. Consider the transport equation $\frac{\partial u}{\partial t} + c(t, x)\,\frac{\partial u}{\partial x} = 0$ with time-varying wave speed.
Define the corresponding characteristic ordinary differential equation to be $\frac{dx}{dt} = c(t, x)$,
the graphs of whose solutions x(t) are the characteristic curves. (a) Prove that any solution
u(t, x) to the partial differential equation is constant on each characteristic curve.
(b) Suppose that the general solution to the characteristic equation is written in the form
$\xi(t, x) = k$, where k is an arbitrary constant. Prove that $\xi(t, x)$ defines a characteristic
variable, meaning that $u(t, x) = f(\xi(t, x))$ is a solution to the time-varying transport equation
for any continuously differentiable scalar function $f \in C^1$.


2.2.27.  (a)  Apply  the  method  in  Exercise  2.2.26  to  find  the  characteristic  curves  for  the  equa- 

tion  ut  +  t  ux  =  0.  (b)  Find  the  solution  to  the  initial  value  problem  a(0,  x)  —  e  x  ,  and 

discuss  its  dynamic  behavior. 
2.2.28. Solve Exercise 2.2.27 for the equation $u_t + (x - t)\,u_x = 0$.

T 2.2.29. Consider the first-order partial differential equation $u_t + (1 - 2t)\,u_x = 0$. Use Exercise
2.2.26 to: (a) Find and sketch the characteristic curves. (b) Write down the general solution.
(c) Solve the initial value problem with $u(0, x) = \frac{1}{1 + x^2}$. (d) Describe the behavior
of your solution u(t, x) from part (c) as $t \to \infty$. What about $t \to -\infty$?

2.2.30.  Discuss  which  of  the  conclusions  of  Exercise  2.2.25  are  valid  for  the  characteristic  curves 
of  the  transport  equation  with  time- varying  wave  speed,  as  analyzed  in  Exercise  2.2.26. 

2.2.31. Consider the two-dimensional transport equation $\frac{\partial u}{\partial t} + c(x, y)\,\frac{\partial u}{\partial x} + d(x, y)\,\frac{\partial u}{\partial y} = 0$,

whose  solution  u(t,x,y)  depends  on  time  t  and  space  variables  x,y.  (a)  Define  a  character¬ 
istic  curve,  and  prove  that  the  solution  is  constant  along  it.  (b)  Apply  the  method  of  char¬ 
acteristics  to  solve  the  initial  value  problem  ut  +  yux  —  xuy,  u(0,x,y)  =  . 

(c)  Describe  the  behavior  of  your  solution. 
The first-order nonlinear partial differential equation

$$u_t + u\,u_x = 0 \tag{2.31}$$

has the form of a transport equation (2.4), but the wave speed c = u now depends, not
on the position x, but rather on the size of the disturbance u. Larger waves will move
faster, and overtake smaller, slower-moving waves. Waves of elevation, where u > 0, move
to the right, while waves of depression, where u < 0, move to the left. This equation
is considerably more challenging than the linear transport models analyzed above, and
was first systematically studied in the early nineteenth century by the influential French
mathematician Siméon-Denis Poisson and the great German mathematician Bernhard Riemann.†
It and its multi-dimensional and multi-component generalizations play a crucial
role  in  the  modeling  of  gas  dynamics,  acoustics,  shock  waves  in  pipes,  flood  waves  in  rivers, 
chromatography,  chemical  reactions,  traffic  flow,  and  so  on.  Although  we  will  be  able  to 
write  down  a  solution  formula,  the  complete  analysis  is  far  from  trivial,  and  will  require  us 
to  confront  the  possibility  of  discontinuous  shock  waves.  Motivated  readers  are  referred  to 
Whitham’s  book,  [122],  for  further  details. 

Fortunately,  the  method  of  characteristics  that  was  developed  for  linear  transport 
equations  also  works  in  the  present  context  and  leads  to  a  complete  mathematical  solution. 
Mimicking  our  previous  construction,  (2.18),  but  now  with  wave  speed  c  =  u,  let  us  define 
a  characteristic  curve  of  the  nonlinear  wave  equation  (2.31)  to  be  the  graph  of  a  solution 
x(t)  to  the  ordinary  differential  equation 


$$\frac{dx}{dt} = u(t, x). \tag{2.32}$$


† In addition to his fundamental contributions to partial differential equations, complex
analysis, and number theory, Riemann also was the inventor of Riemannian geometry, which turned
out to be absolutely essential for Einstein's theory of general relativity some 70 years later!
As such, the characteristics depend upon the solution u, which, in turn, is to be specified
by  its  characteristics.  We  appear  to  be  trapped  in  a  circular  argument. 

The  resolution  of  the  conundrum  is  to  argue  that,  as  in  the  linear  case,  the  solution 
u(t,  x )  remains  constant  along  its  characteristics,  and  this  fact  will  allow  us  to  simultane¬ 
ously  specify  both.  To  prove  this  claim,  suppose  that  x  =  x(t)  parametrizes  a  characteristic 
curve associated with the given solution u(t, x). Our task is to show that h(t) = u(t, x(t)),
which is obtained by evaluating the solution along the curve, is constant, which, as usual,
is proved by checking that its derivative is identically zero. Repeating our chain rule
computation (2.17), and using (2.32), we deduce that

$$\frac{dh}{dt} = \frac{d}{dt}\,u(t, x(t)) = \frac{\partial u}{\partial t}(t, x(t)) + \frac{dx}{dt}\,\frac{\partial u}{\partial x}(t, x(t)) = \frac{\partial u}{\partial t}(t, x(t)) + u(t, x(t))\,\frac{\partial u}{\partial x}(t, x(t)) = 0,$$


since  u  is  assumed  to  solve  the  nonlinear  transport  equation  (2.31)  at  all  values  of  (t,x), 
including  those  on  the  characteristic  curve.  We  conclude  that  h(t)  is  constant,  and  hence 
u  is  indeed  constant  on  the  characteristic  curve. 

Now  comes  the  clincher.  We  know  that  the  right-hand  side  of  the  characteristic  ordi¬ 
nary  differential  equation  (2.32)  is  a  constant  whenever  x  =  x(t)  defines  a  characteristic 
curve.  This  means  that  the  derivative  dx/dt  is  a  constant  —  namely  the  fixed  value  of  u 
on  the  curve.  Therefore,  the  characteristic  curve  must  be  a  straight  line , 


x  =  ut  +  k,  (2.33) 

whose  slope  equals  the  value  assumed  by  the  solution  u  on  it. 

And,  as  before,  since  the  solution  is  constant  along  each  characteristic  line,  it  must  be 
a  function  of  the  characteristic  variable 


$$\xi = x - t\,u \tag{2.34}$$

alone, and so

$$u = f(x - t\,u), \tag{2.35}$$

where $f(\xi)$ is an arbitrary $C^1$ function. Formula (2.35) should be viewed as an algebraic
equation  that  implicitly  defines  the  solution  u(t,  x)  as  a  function  of  t  and  x.  Verification 
that  the  resulting  function  is  indeed  a  solution  to  (2.31)  is  the  subject  of  Exercise  2.3.14. 

Example 2.6. Suppose that

$$f(\xi) = \alpha\,\xi + \beta,$$

with $\alpha, \beta$ constant. Then (2.35) becomes

$$u = \alpha\,(x - t\,u) + \beta,$$

and hence

$$u(t, x) = \frac{\alpha\,x + \beta}{1 + \alpha\,t} \tag{2.36}$$

is the corresponding solution to the nonlinear transport equation. At each fixed t, the graph
of the solution is a straight line. If $\alpha > 0$, the solution flattens out: $u(t, x) \to 0$ as $t \to \infty$.
On the other hand, if $\alpha < 0$, the straight line rapidly steepens to vertical as t approaches
the critical time $t_\star = -1/\alpha$, at which point the solution ceases to exist. Figure 2.11 graphs
two representative solutions. The top row shows the solution with $\alpha = 1$, $\beta = .5$, plotted
at times t = 0, 1, 5, and 20; the bottom row takes $\alpha = -.2$, $\beta = .1$, and plots the solution
at times t = 0, 3, 4, and 4.9. In the second case, the solution blows up by becoming vertical
as $t \to 5$.
Figure 2.11. Two solutions to $u_t + u\,u_x = 0$.


Remark :  Although  (2.36)  remains  a  valid  solution  formula  after  the  blow-up  time, 
t  >  5,  this  is  not  to  be  viewed  as  a  part  of  the  original  solution.  With  the  appearance  of 
such  a  singularity,  the  physical  solution  has  broken  down,  and  we  stop  tracking  it. 
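The behavior of the straight-line solutions is easy to explore numerically. The following sketch (with illustrative function names that are not part of the text) evaluates the solution $u = (\alpha x + \beta)/(1 + \alpha t)$, its slope $\alpha/(1 + \alpha t)$, and the blow-up time $-1/\alpha$, and estimates the residual $u_t + u\,u_x$ by central differences as a consistency check.

```python
def straight_line_solution(t, x, alpha, beta):
    """u(t, x) = (alpha*x + beta) / (1 + alpha*t), valid before any blow-up time."""
    return (alpha * x + beta) / (1.0 + alpha * t)

def profile_slope(t, alpha):
    """Slope du/dx of the straight-line profile at time t."""
    return alpha / (1.0 + alpha * t)

def critical_time(alpha):
    """Blow-up time -1/alpha when alpha < 0; None when the solution exists for all t > 0."""
    return -1.0 / alpha if alpha < 0 else None

def pde_residual(t, x, alpha, beta, h=1e-5):
    """Central-difference estimate of u_t + u*u_x, which should be close to zero."""
    u = straight_line_solution
    u_t = (u(t + h, x, alpha, beta) - u(t - h, x, alpha, beta)) / (2 * h)
    u_x = (u(t, x + h, alpha, beta) - u(t, x - h, alpha, beta)) / (2 * h)
    return u_t + u(t, x, alpha, beta) * u_x

# Second case of the example: alpha = -.2, beta = .1; the slope steepens as t approaches 5.
for t in (0.0, 3.0, 4.0, 4.9):
    print(t, profile_slope(t, -0.2))
print("critical time:", critical_time(-0.2))
```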

To  solve  the  general  initial  value  problem 

$$u(0, x) = f(x), \tag{2.37}$$

we  note  that,  at  t  =  0,  the  implicit  solution  formula  (2.35)  reduces  to  (2.37),  and  hence  the 
function  /  coincides  with  the  initial  data.  However,  because  our  solution  formula  (2.35)  is 
an  implicit  equation,  it  is  not  immediately  evident 

(a)  whether  it  can  be  solved  to  give  a  well-defined  function  u(t,x),  and, 

(b)  even  granted  this,  how  to  describe  the  resulting  solution’s  qualitative  features  and 
dynamical  behavior. 

A  more  instructive  approach  is  founded  on  the  following  geometrical  construction. 
Through  each  point  (0,  y)  on  the  x-axis,  draw  the  characteristic  line 

x  =  tf(y)+y  (2.38) 

whose  slope,  namely  f(y)  =  u(0,  y),  equals  the  value  of  the  initial  data  (2.37)  at  that  point. 
According  to  the  preceding  discussion,  the  solution  will  have  the  same  value  on  the  entire 
characteristic  line  (2.38),  and  so 

u(t,tf(y)  +  y)  =  f(y)  for  all  t.  (2.39) 

For example, if f(y) = y, then u(t, x) = y whenever x = ty + y; eliminating y, we find
u(t,x)  =  x/(t  +  1),  which  agrees  with  one  of  our  straight  line  solutions  (2.36). 

Now,  the  problem  with  this  construction  is  immediately  apparent  from  Figure  2.12, 
which  plots  the  characteristic  lines  associated  with  the  initial  data 

$$u(0, x) = \tfrac{1}{2}\pi - \tan^{-1} x.$$

Figure 2.12. Characteristic lines for $u(0, x) = \tfrac{1}{2}\pi - \tan^{-1} x$.


Two  characteristic  lines  that  are  not  parallel  must  cross  each  other  somewhere.  The  value 
of  the  solution  is  supposed  to  equal  the  slope  of  the  characteristic  line  passing  through  the 
point.  Hence,  at  a  crossing  point,  the  solution  is  required  to  assume  two  different  values, 
one  corresponding  to  each  line.  Something  is  clearly  amiss,  and  we  need  to  resolve  this 
apparent  paradox. 

There  are  three  principal  scenarios.  The  first,  trivial,  situation  occurs  when  all  the 
characteristic  lines  are  parallel,  and  so  the  difficulty  does  not  arise.  In  this  case,  they  all 
have  the  same  slope,  say  c,  which  means  that  the  solution  has  the  same  value  on  each  one. 
Therefore,  u(t,  x)  =  c  is  a  constant  solution. 

The  next-simplest  case  occurs  when  the  initial  data  is  everywhere  nondecreasing ,  so 
$f(x) \leq f(y)$ whenever $x < y$, which is assured if its derivative is never negative: $f'(x) \geq 0$.
In this case, as sketched in Figure 2.13, the characteristic lines emanating from the x-axis
fan out into the right half-plane, and so never cross each other at any future time t > 0.
Each point (t, x) with $t \geq 0$ lies on a unique characteristic line, and the value of the
solution at (t, x) is equal to the slope of the line. We conclude that the solution u(t, x)
is well defined at all future times $t \geq 0$. Physically, such solutions represent rarefaction
waves, which spread out as time progresses. A typical example, corresponding to initial
data

$$u(0, x) = \tfrac{1}{2}\pi + \tan^{-1}(3x),$$

has  its  characteristic  lines  plotted  in  Figure  2.13,  while  Figure  2.14  graphs  some  represen¬ 
tative  solution  profiles. 
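For nondecreasing initial data, the implicit relation $u = f(x - t\,u)$ has exactly one root u for each (t, x) with $t \geq 0$, because $g(u) = u - f(x - t\,u)$ is strictly increasing in u. A minimal sketch of computing the rarefaction wave by bisection follows; the bracket $[0, \pi]$ relies on the range of this particular initial profile, and the function names and iteration count are assumptions made for illustration.

```python
import math

def f(x):
    """Rarefaction initial data u(0, x) = pi/2 + arctan(3x), with values in (0, pi)."""
    return 0.5 * math.pi + math.atan(3.0 * x)

def rarefaction(t, x, iters=60):
    """Solve u = f(x - t*u) for u by bisection on [0, pi], assuming t >= 0."""
    lo, hi = 0.0, math.pi  # g(lo) < 0 < g(hi) because 0 < f < pi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - f(x - t * mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The solution is constant along the characteristic line x = t*f(y) + y.
y = -0.3
for t in (0.0, 1.0, 3.0):
    print(t, rarefaction(t, y + t * f(y)), f(y))
```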

The  more  interesting  case  occurs  when  the  initial  data  is  a  decreasing  function,  and  so 
f'(x)  <  0.  Now,  as  in  Figure  2.12,  some  of  the  characteristic  lines  starting  at  t  =  0  will  cross 
at some point in the future. If a point (t, x) lies on two or more distinct characteristic lines,
the  value  of  the  solution  u(t,x),  which  should  equal  the  characteristic  slope,  is  no  longer 
uniquely  determined.  Although,  in  a  purely  mathematical  context,  one  might  be  tempted 
to  allow  such  multiply  valued  solutions,  from  a  physical  standpoint  this  is  unacceptable. 
The  solution  u(t,x)  is  supposed  to  represent  a  measurable  quantity,  e.g.,  concentration, 
Figure 2.13. Characteristic lines for a rarefaction wave.


Figure 2.14. Rarefaction wave.


velocity,  pressure,  and  must  therefore  assume  a  unique  value  at  each  point.  In  effect,  the 
mathematical  model  has  broken  down  and  no  longer  conforms  to  physical  reality. 

However,  before  confronting  this  difficulty,  let  us  first,  from  a  purely  theoretical  stand¬ 
point,  try  to  understand  what  happens  if  we  mathematically  continue  the  solution  as  a 
multiply  valued  function.  For  specificity,  consider  the  initial  data 

$$u(0, x) = \tfrac{1}{2}\pi - \tan^{-1} x, \tag{2.40}$$

appearing  in  the  first  graph  in  Figure  2.15.  The  corresponding  characteristic  lines  are 
displayed  in  Figure  2.12.  Initially,  they  do  not  cross,  and  the  solution  remains  a  well- 
defined,  single-valued  function.  However,  after  a  while  one  reaches  a  critical  time,  t*  >  0, 
when  the  first  two  characteristic  lines  cross  each  other.  Subsequently,  a  wedge-shaped 
region appears in the (t, x)-plane, consisting of points that lie on the intersection of three

Figure 2.15. Multiply valued compression wave.


distinct  characteristic  lines  with  different  slopes;  at  such  points,  the  mathematical  solution 
achieves three distinct values. Points outside the wedge lie on a single characteristic line,
and  the  solution  remains  single- valued  there.  The  boundary  of  the  wedge  consists  of  points 
where  precisely  two  characteristic  lines  cross. 

To  fully  appreciate  what  is  going  on,  look  now  at  the  sequence  of  pictures  of  the 
multiply  valued  solution  in  Figure  2.15,  plotted  at  six  successive  times.  Since  the  initial 
data  is  positive,  f(x)  >  0,  all  the  characteristic  slopes  are  positive.  As  a  consequence, 
every  point  on  the  solution  curve  moves  to  the  right,  at  a  speed  equal  to  its  height.  Since 
the  initial  data  is  a  decreasing  function,  points  on  the  graph  lying  to  the  left  will  move 
faster  than  those  to  the  right  and  eventually  overtake  them.  At  first,  the  solution  merely 
steepens  into  a  compression  wave.  At  the  critical  time  t*  when  the  first  two  characteristic 
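lines cross, say at position $x_\star$, so that $(t_\star, x_\star)$ is the tip of the aforementioned wedge, the
solution graph has become vertical:

$$\frac{\partial u}{\partial x}(t, x_\star) \longrightarrow \infty \quad \text{as} \quad t \to t_\star,$$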


and  u(t,x)  is  no  longer  a  classical  solution.  Once  this  occurs,  the  solution  graph  ceases  to 
be a single-valued function, and its overlapping lobes lie over the points (t, x) belonging to
the  wedge. 

The  critical  time  t*  can,  in  fact,  be  determined  from  the  implicit  solution  formula  (2.35). 
Indeed,  if  we  differentiate  with  respect  to  x,  we  obtain 


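$$\frac{\partial u}{\partial x} = f'(\xi)\left(1 - t\,\frac{\partial u}{\partial x}\right), \qquad \text{where} \quad \xi = x - t\,u.$$

Solving for

$$\frac{\partial u}{\partial x} = \frac{f'(\xi)}{1 + t\,f'(\xi)},$$

we see that the slope blows up:

$$\frac{\partial u}{\partial x} \longrightarrow \infty \quad \text{as} \quad t \longrightarrow -\frac{1}{f'(\xi)}.$$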


In  other  words,  if  the  initial  data  has  negative  slope  at  position  x,  so  f'(x)  <  0,  then  the 
solution  along  the  characteristic  line  emanating  from  the  point  (0,  x)  will  fail  to  be  smooth 
at the time $-1/f'(x)$. The earliest critical time is, thus,

$$t_\star := \min\left\{\, -\frac{1}{f'(x)} \;\middle|\; f'(x) < 0 \,\right\}. \tag{2.41}$$


If $x_0$ is the value of x that produces the minimum $t_\star$, then the slope of the solution profile
will first become infinite at the location where the characteristic starting at $x_0$ is at time
$t_\star$, namely

$$x_\star = x_0 + f(x_0)\,t_\star. \tag{2.42}$$

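For instance, for the particular initial configuration (2.40) represented in Figure 2.15,

$$f(x) = \tfrac{1}{2}\pi - \tan^{-1} x, \qquad f'(x) = -\frac{1}{1 + x^2},$$

and so the critical time is

$$t_\star = \min\{\,1 + x^2\,\} = 1, \qquad \text{with} \qquad x_\star = f(0)\,t_\star = \tfrac{1}{2}\pi,$$

since the minimum value occurs at $x_0 = 0$.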
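When the minimization cannot be done by hand, the critical time and shock-onset position can be estimated by scanning the initial slope over a grid. The sketch below does this for the same initial profile; the grid, the function names, and the brute-force search are illustrative assumptions, and a finer grid or a proper optimizer would be needed for less well-behaved data.

```python
import math

def f(x):
    """Initial data u(0, x) = pi/2 - arctan(x)."""
    return 0.5 * math.pi - math.atan(x)

def f_prime(x):
    """Its derivative, -1 / (1 + x^2), negative everywhere."""
    return -1.0 / (1.0 + x * x)

def breaking_point(f, f_prime, xs):
    """Return (t_star, x0, x_star): the minimum of -1/f'(x) over grid points with f'(x) < 0,
    the grid point x0 attaining it, and the position x0 + f(x0)*t_star where the slope
    of the profile first becomes infinite.  Returns None if f' is never negative on the grid."""
    best = None
    for x in xs:
        d = f_prime(x)
        if d < 0.0:
            t = -1.0 / d
            if best is None or t < best[0]:
                best = (t, x)
    if best is None:
        return None
    t_star, x0 = best
    return t_star, x0, x0 + f(x0) * t_star

grid = [i / 100.0 for i in range(-500, 501)]  # hypothetical search grid on [-5, 5]
t_star, x0, x_star = breaking_point(f, f_prime, grid)
print(t_star, x0, x_star)
```

For this profile the scan reproduces the values found above: a critical time of 1, attained at $x_0 = 0$, with the shock first forming at position $\tfrac{1}{2}\pi$.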

Now,  while  mathematically  plausible,  such  a  multiply  valued  solution  is  physically 
untenable. So what really happens after the critical time $t_\star$? One needs to decide which
(if  any)  of  the  possible  solution  values  is  physically  appropriate.  The  mathematical  model, 
in  and  of  itself,  is  incapable  of  resolving  this  quandary.  We  must  therefore  revisit  the 
underlying  physics,  and  ask  what  sort  of  phenomenon  we  are  trying  to  model. 


Shock  Dynamics 

To  be  specific,  let  us  regard  the  transport  equation  (2.31)  as  a  model  of  compressible  fluid 
flow  in  a  single  space  variable,  e.g.,  the  motion  of  gas  in  a  long  pipe.  If  we  push  a  piston 
into  the  pipe,  then  the  gas  will  move  ahead  of  it  and  thereby  be  compressed.  However,  if 
the  piston  moves  too  rapidly,  then  the  gas  piles  up  on  top  of  itself,  and  a  shock  wave  forms 
and  propagates  down  the  pipe.  Mathematically,  the  shock  is  represented  by  a  discontinuity 
where  the  solution  abruptly  changes  value.  The  formulas  (2.41)  and  (2.42)  determine  the 
time  and  position  for  the  onset  of  the  shock-wave  discontinuity.  Our  goal  now  is  to  predict 
its  subsequent  behavior,  and  this  will  be  based  on  use  of  a  suitable  physical  conservation 
law. Indeed, one expects mass to be conserved, even through a shock discontinuity,
since  gas  atoms  can  neither  be  created  nor  destroyed.  And,  as  we  will  see,  conservation  of 
mass  (almost)  suffices  to  prescribe  the  subsequent  motion  of  the  shock  wave. 

Before  investigating  the  implications  of  conservation  of  mass,  let  us  first  convince 
ourselves  of  its  validity  for  the  nonlinear  transport  model.  (Just  because  a  mathematical 
equation  models  a  physical  system  does  not  automatically  imply  that  it  inherits  any  of  its 
physical conservation laws.) If u(t, x) represents density, then, at time t, the total mass
lying in an interval $a \leq x \leq b$ is calculated by integration:

$$M_{a,b}(t) = \int_a^b u(t, x)\,dx. \tag{2.43}$$

Assuming  that  u(t,x)  is  a  classical  solution  to  the  nonlinear  transport  equation  (2.31),  we 
can  determine  the  rate  of  change  of  mass  on  this  interval  by  differentiation: 


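$$\begin{aligned} \frac{dM_{a,b}}{dt} &= \frac{d}{dt}\int_a^b u(t, x)\,dx = \int_a^b \frac{\partial u}{\partial t}(t, x)\,dx = -\int_a^b u(t, x)\,\frac{\partial u}{\partial x}(t, x)\,dx \\ &= -\int_a^b \frac{\partial}{\partial x}\Bigl[\tfrac{1}{2}\,u(t, x)^2\Bigr]\,dx = -\tfrac{1}{2}\,u(t, x)^2\,\Big|_{x=a}^{b} = \tfrac{1}{2}\,u(t, a)^2 - \tfrac{1}{2}\,u(t, b)^2. \end{aligned} \tag{2.44}$$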


The  final  expression  represents  the  net  mass  flux  through  the  endpoints  of  the  interval. 
Thus,  the  only  way  in  which  the  mass  on  the  interval  [a,  b]  changes  is  through  its  endpoints; 
inside,  mass  can  be  neither  created  nor  destroyed,  which  is  the  precise  meaning  of  the  mass 
conservation  law  in  continuum  mechanics.  In  particular,  if  there  is  zero  net  mass  flux,  then 
the  total  mass  is  constant,  and  hence  conserved.  For  example,  if  the  initial  data  (2.37)  has 
finite  total  mass, 

$$\int_{-\infty}^{\infty} f(x)\,dx < \infty, \tag{2.45}$$

which requires that $f(x) \to 0$ reasonably rapidly as $|x| \to \infty$, then the total mass of the
solution, at least up to the formation of a shock discontinuity, remains constant and
equal to its initial value:

$$\int_{-\infty}^{\infty} u(t, x)\,dx = \int_{-\infty}^{\infty} u(0, x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx. \tag{2.46}$$

Similarly,  if  u(t,  x)  represents  the  traffic  density  on  a  highway  at  time  t  and  position  x, 
then  the  integrated  conservation  law  (2.44)  tells  us  that  the  rate  of  change  in  the  number 
of  vehicles  on  the  stretch  of  road  between  a  and  b  equals  the  number  of  vehicles  entering 
at  point  a  minus  the  number  leaving  at  point  b  —  which  assumes  that  there  are  no  other 
exits  or  entrances  on  this  part  of  the  highway.  Thus,  in  the  traffic  model,  (2.44)  represents 
the  conservation  of  vehicles. 
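The integrated conservation law can be checked numerically on any classical solution. The sketch below uses one of the straight-line solutions of Example 2.6 as a test case, integrates with the composite Simpson rule, and compares a central-difference estimate of the rate of change of mass with the net flux through the endpoints. The helper names, the interval, and the step sizes are illustrative assumptions.

```python
def mass(u, t, a, b, n=200):
    """Approximate the integral of u(t, x) over [a, b] by composite Simpson (n even)."""
    h = (b - a) / n
    total = u(t, a) + u(t, b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * u(t, a + i * h)
    return total * h / 3.0

def mass_rate(u, t, a, b, dt=1e-5):
    """Central-difference estimate of the rate of change of the mass on [a, b]."""
    return (mass(u, t + dt, a, b) - mass(u, t - dt, a, b)) / (2.0 * dt)

def net_flux(u, t, a, b):
    """Net flux through the endpoints: u(t, a)^2 / 2 - u(t, b)^2 / 2."""
    return 0.5 * u(t, a) ** 2 - 0.5 * u(t, b) ** 2

def u_line(t, x):
    """Test solution of u_t + u*u_x = 0: the straight-line profile with alpha = 1, beta = .5."""
    return (x + 0.5) / (1.0 + t)

print(mass_rate(u_line, 1.0, 0.0, 2.0), net_flux(u_line, 1.0, 0.0, 2.0))
```

Both printed numbers should agree up to discretization error, here near -0.75.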

The  preceding  calculation  relied  on  the  fact  that  the  integrand  can  be  written  as  an  x 
derivative.  This  is  a  common  feature  of  physical  conservation  laws  in  continuum  mechanics, 
and  motivates  the  following  general  definition. 


Definition 2.7. A conservation law, in one space dimension, is an equation of the form

$$\frac{\partial T}{\partial t} + \frac{\partial X}{\partial x} = 0. \tag{2.47}$$

The function T is known as the conserved density, while X is the associated flux.


In  the  simplest  situations,  the  conserved  density  T(t,x,u)  and  flux  X(t,x,u)  depend 
on  the  time  t,  the  position  x,  and  the  solution  u(t,  x)  to  the  physical  system.  (Higher-order 
conservation  laws,  which  also  depend  on  derivatives  of  u,  arise  in  the  analysis  of  integrable 
partial  differential  equations;  see  Section  8.5  and  [36,87].)  For  example,  the  nonlinear 
transport  equation  (2.31)  is  itself  a  conservation  law,  since  it  can  be  written  in  the  form 

$$\frac{\partial u}{\partial t} + \frac{\partial}{\partial x}\Bigl(\tfrac{1}{2}\,u^2\Bigr) = 0, \tag{2.48}$$

and so the conserved density is T = u and the flux is $X = \tfrac{1}{2}u^2$. And indeed, it was
this  identity  that  made  our  computation  (2.44)  work.  The  general  result,  proved  by  an 
analogous  computation,  justifies  calling  (2.47)  a  conservation  law. 


Proposition 2.8. Given a conservation law (2.47), then, on any closed interval $a \leq x \leq b$,

$$\frac{d}{dt}\int_a^b T\,dx = -X\,\Big|_{x=a}^{b}. \tag{2.49}$$


Proof: The proof is an immediate consequence of the Fundamental Theorem of Calculus,
assuming sufficient smoothness that allows one to bring the derivative inside the
integral sign:

$$\frac{d}{dt}\int_a^b T\,dx = \int_a^b \frac{\partial T}{\partial t}\,dx = -\int_a^b \frac{\partial X}{\partial x}\,dx = -X\,\Big|_{x=a}^{b}. \qquad \text{Q.E.D.}$$


We  will  refer  to  (2.49)  as  the  integrated  form  of  the  conservation  law  (2.47).  It  states 
that  the  rate  of  change  of  the  total  density,  integrated  over  an  interval,  is  equal  to  the 
amount  of  flux  through  its  two  endpoints.  In  particular,  if  there  is  no  net  flux  into  or  out 
of  the  interval,  then  the  integrated  density  is  conserved ,  meaning  that  it  remains  constant 
over  time.  All  physical  conservation  laws  —  mass,  momentum,  energy,  and  so  on  —  for 
systems  governed  by  partial  differential  equations  are  of  this  form  or  its  multi-dimensional 
extensions,  [87]. 
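The integrated form (2.49) can be checked numerically. The following is a minimal sketch, assuming the smooth solution u(t, x) = x/(1 + t) of the nonlinear transport equation as test data; that solution, the interval [1, 3], and the helper names are illustrative choices, not part of the text.

```python
# Check the integrated conservation law (2.49) for u_t + u u_x = 0,
# using the smooth solution u(t, x) = x / (1 + t) (an illustrative choice).

def u(t, x):
    return x / (1.0 + t)

def mass(t, a, b, n=2000):
    """Midpoint-rule approximation of the integral of T = u over [a, b]."""
    h = (b - a) / n
    return sum(u(t, a + (i + 0.5) * h) for i in range(n)) * h

def mass_rate(t, a, b, dt=1e-5):
    """Centered finite difference for d/dt of the mass in [a, b]."""
    return (mass(t + dt, a, b) - mass(t - dt, a, b)) / (2 * dt)

def net_flux(t, a, b):
    """Right-hand side of (2.49) with X = u^2 / 2: flux in at a minus flux out at b."""
    return 0.5 * u(t, a) ** 2 - 0.5 * u(t, b) ** 2

t, a, b = 0.5, 1.0, 3.0
lhs = mass_rate(t, a, b)
rhs = net_flux(t, a, b)
print(lhs, rhs)
```

The two printed numbers should agree up to the finite-difference error, reflecting that mass in the interval changes only through flux at its endpoints.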

With  this  in  hand,  let  us  return  to  the  physical  context  of  the  nonlinear  transport 
equation.  By  definition,  a  shock  is  a  discontinuity  in  the  solution  u(t,x).  We  will  make 
the  physically  plausible  assumption  that  mass  (or  vehicle)  conservation  continues  to  hold 
even within the shock. Recall that the total mass, which at time t is the area† under
the curve u(t, x), must be conserved. This continues to hold even when the mathematical
solution becomes multiply valued, in which case one employs a line integral $\int_C u \, dx$, where
C represents the graph of the solution, to compute the mass/area. Thus, to construct a
discontinuous shock solution with the same mass, one replaces part of the multiply valued
graph by a vertical shock line in such a way that the resulting function is single-valued and
has the same area under its graph. Referring to Figure 2.16, observe that the region under
the shock graph is obtained from that under the multi-valued solution graph by deleting
the upper shaded lobe and appending the lower shaded lobe. Thus the resulting area will
be the same, provided the shock line is drawn so that the areas of the two shaded lobes are
equal. This construction is known as the Equal Area Rule; it ensures that the total mass
of the shock solution matches that of the multiply valued solution, which in turn is equal
to the initial mass, as required by the physical conservation law.

† We are implicitly assuming that the mass is finite, as in (2.45), although the overall
construction does not rely on this restriction.


Example 2.9. An illuminating special case occurs when the initial data has the form
of a step function with a single discontinuity at the origin:

$$ u(0, x) = \begin{cases} a, & x < 0, \\ b, & x > 0. \end{cases} \tag{2.50} $$

If a > b, then the initial data is already in the form of a shock wave. For t > 0, the
mathematical solution constructed by continuing along the characteristic lines is multiply
valued in the region bt < x < at, where it assumes both values a and b; see Figure 2.17.
Moreover, the initial vertical line of discontinuity has become a tilted line, because each
point (0, u) on it has moved along the associated characteristic a distance ut. The Equal
Area Rule tells us to draw the shock line halfway along, at $x = \tfrac12 (a + b)\,t$, in order that the
two triangles have the same area. We deduce that the shock moves with speed $c = \tfrac12 (a + b)$,
equal to the average of the two speeds at the jump. The resulting shock-wave solution is

$$ u(t, x) = \begin{cases} a, & x < ct, \\ b, & x > ct, \end{cases} \qquad \text{where} \quad c = \frac{a + b}{2}. \tag{2.51} $$

A plot of its characteristic lines appears in Figure 2.18. Observe that colliding pairs of
characteristic lines terminate at the shock line, whose slope is the average of their individual
slopes.
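As a small illustration of Example 2.9, the sketch below evaluates the step-shock solution and confirms that, with the shock moving at the average speed, the mass in a fixed interval [−L, L] changes at the rate given by the flux through its endpoints. The function names, the sample values a = 3, b = 1, and the convention at the jump itself are assumptions made for the illustration.

```python
# Step-shock solution (2.51) of u_t + u u_x = 0 with left state a > right state b.

def shock_speed(a, b):
    """Shock speed from Example 2.9: the average of the two states."""
    return 0.5 * (a + b)

def step_shock(t, x, a, b):
    """Value of the shock-wave solution (2.51) at (t, x); x == ct is assigned to b."""
    return a if x < shock_speed(a, b) * t else b

def mass(t, a, b, L):
    """Exact mass of the piecewise-constant profile on [-L, L] (needs |ct| <= L)."""
    s = shock_speed(a, b) * t
    return a * (s + L) + b * (L - s)

a, b, L = 3.0, 1.0, 10.0
dt = 0.25
rate = (mass(1.0 + dt, a, b, L) - mass(1.0, a, b, L)) / dt
flux = 0.5 * (a ** 2 - b ** 2)   # flux in at -L minus flux out at L, with X = u^2 / 2
print(shock_speed(a, b), rate, flux)
```

Because the mass is linear in t here, the difference quotient is exact, so the rate and the flux coincide.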


The fact that the shock speed equals the average of the solution values on either side
is of general validity, and is known as the Rankine-Hugoniot condition, named after
the nineteenth-century Scottish physicist William Rankine and French engineer Pierre
Hugoniot, although historically these conditions first appeared in an 1849 paper by George
Stokes, [109]. However, intimidated by criticism by his contemporary applied mathematicians
Lords Kelvin and Rayleigh, Stokes thought he was mistaken, and even ended up
deleting the relevant part when his collected works were published in 1883, [110]. The
missing section was restored in the 1966 reissue, [111].

Figure 2.18. Characteristic lines for the step wave shock.


Proposition 2.10. Let u(t, x) be a solution to the nonlinear transport equation that
has a discontinuity at position x = σ(t), with finite, unequal left- and right-hand limits

$$ u^-(t) = u(t, \sigma(t)^-) = \lim_{x \to \sigma(t)^-} u(t, x), \qquad u^+(t) = u(t, \sigma(t)^+) = \lim_{x \to \sigma(t)^+} u(t, x), \tag{2.52} $$

on either side of the shock discontinuity. Then, to maintain conservation of mass, the speed
of the shock must equal the average of the solution values on either side:

$$ \frac{d\sigma}{dt} = \frac{u^-(t) + u^+(t)}{2}. \tag{2.53} $$


Proof: Referring to Figure 2.19, consider a small time interval, from t to t + Δt,
with Δt > 0. During this time, the shock moves from position a = σ(t) to position
b = σ(t + Δt). The total mass contained in the interval [a, b] at time t, before the shock
has passed through, is

$$ M(t) = \int_a^b u(t, x) \, dx \approx u^+(t) \, (b - a) = u^+(t) \, \bigl[ \sigma(t + \Delta t) - \sigma(t) \bigr], $$

where we assume that Δt ≪ 1 is very small, and so the integrand is well approximated by
its limiting value (2.52). Similarly, after the shock has passed, the total mass remaining in
the interval is

$$ M(t + \Delta t) = \int_a^b u(t + \Delta t, x) \, dx \approx u^-(t + \Delta t) \, (b - a) = u^-(t + \Delta t) \, \bigl[ \sigma(t + \Delta t) - \sigma(t) \bigr]. $$

Thus, the rate of change in mass across the shock at time t is given by

$$ \frac{dM}{dt} = \lim_{\Delta t \to 0} \frac{M(t + \Delta t) - M(t)}{\Delta t} = \lim_{\Delta t \to 0} \bigl[ u^-(t + \Delta t) - u^+(t) \bigr] \, \frac{\sigma(t + \Delta t) - \sigma(t)}{\Delta t} = \bigl[ u^-(t) - u^+(t) \bigr] \, \frac{d\sigma}{dt}. $$

On the other hand, at any t ≤ τ ≤ t + Δt, the mass flux into the interval [a, b] through
the endpoints is given by the right-hand side of (2.44):

$$ \tfrac12 \bigl[ u(\tau, a)^2 - u(\tau, b)^2 \bigr] \;\longrightarrow\; \tfrac12 \bigl[ u^-(t)^2 - u^+(t)^2 \bigr], \qquad \text{since } \tau \to t \text{ as } \Delta t \to 0. $$

Conservation of mass requires that the rate of change in mass be equal to the mass flux:

$$ \frac{dM}{dt} = \bigl[ u^-(t) - u^+(t) \bigr] \, \frac{d\sigma}{dt} = \tfrac12 \bigl[ u^-(t)^2 - u^+(t)^2 \bigr]. $$

Solving for dσ/dt establishes (2.53). Q.E.D.
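The last step of the proof is a difference-of-squares factorization. Written out, and using the hypothesis that the two limits are unequal so that the common factor can be cancelled:

```latex
\[
\bigl[ u^-(t) - u^+(t) \bigr] \frac{d\sigma}{dt}
  = \tfrac12 \bigl[ u^-(t)^2 - u^+(t)^2 \bigr]
  = \tfrac12 \bigl[ u^-(t) - u^+(t) \bigr] \bigl[ u^-(t) + u^+(t) \bigr]
\quad\Longrightarrow\quad
\frac{d\sigma}{dt} = \frac{u^-(t) + u^+(t)}{2}.
\]
```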


Example  2.11.  By  way  of  contrast,  let  us  investigate  the  case  when  the  initial  data 
is a step function (2.50), but with a < b, so the jump goes upwards. In this case, the
characteristic  lines  diverge  from  the  initial  discontinuity,  and  the  mathematical  solution  is 
not  specified  at  all  in  the  wedge-shaped  region  at  <  x  <  bt.  Our  task  is  to  decide  how  to 
“fill  in”  the  solution  values  between  the  two  regions  where  the  solution  is  well  defined  and 
constant. 

One possible connection is by a straight line. Indeed, a simple modification of the
rational solution (2.36) produces the similarity solution†

$$ u(t, x) = \frac{x}{t}, $$

which not only solves the differential equation, but also has the required values u(t, at) = a
and u(t, bt) = b at the two edges of the wedge. This can be used to construct the piecewise
affine rarefaction wave

$$ u(t, x) = \begin{cases} a, & x \le at, \\ x/t, & at \le x \le bt, \\ b, & x \ge bt, \end{cases} \tag{2.54} $$

which is graphed at four representative times in Figure 2.20.

† See Section 8.2 for general techniques for constructing similarity (scale-invariant) solutions
to partial differential equations.

Figure 2.20. Rarefaction wave.

A  second  possibility  would  be  to  continue  the  discontinuity  as  a  shock  wave,  whose 
speed  is  governed  by  the  Rankine-Hugoniot  condition,  leading  to  a  discontinuous  solution 
having  the  same  formula  as  (2.51).  Which  of  the  two  competing  solutions  should  we 
use?  The  first,  (2.54),  makes  better  physical  sense;  indeed,  if  we  were  to  smooth  out  the 
discontinuity,  then  the  resulting  solutions  would  converge  to  the  rarefaction  wave  and  not 
the  reverse  shock  wave;  see  Exercise  2.3.13.  Moreover,  the  discontinuous  solution  (2.51) 
has  characteristic  lines  emanating  from  the  discontinuity,  which  means  that  the  shock  is 
creating  new  values  for  the  solution  as  it  moves  along,  and  this  can,  in  fact,  be  done  in  a 
variety of ways. In other words, the discontinuous solution violates causality, the principle that
the solution profile at any given time uniquely prescribes its subsequent motion. Causality
requires  that,  while  characteristics  may  terminate  at  a  shock  discontinuity,  they  cannot 
begin  there,  because  their  slopes  will  not  be  uniquely  prescribed  by  the  shock  profile,  and 
hence  the  characteristics  to  the  left  of  the  shock  must  have  larger  slope  (or  speed),  while 
those  to  the  right  must  have  smaller  slope.  Since  the  shock  speed  is  the  average  of  the  two 
characteristic  slopes,  this  requires  the  Entropy  Condition 


$$ u^-(t) > \frac{d\sigma}{dt} = \frac{u^-(t) + u^+(t)}{2} > u^+(t). \tag{2.55} $$

With further analysis, it can be shown, [57], that the rarefaction wave (2.54) is the unique
solution† to the initial value problem satisfying the entropy condition (2.55).

† Albeit not a classical solution, but rather a weak solution, as per Section 10.4.

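The rarefaction wave of Example 2.11 is easy to evaluate directly. The sketch below, with illustrative values a = 1, b = 3 and hypothetical function names, evaluates the piecewise formula (2.54) and uses centered differences to confirm that the fan x/t satisfies the nonlinear transport equation and matches the constant states at the edges of the wedge.

```python
# Rarefaction wave (2.54) for u_t + u u_x = 0 with a < b, valid for t > 0.

def rarefaction(t, x, a, b):
    """Piecewise affine rarefaction wave: a, then the fan x/t, then b."""
    if x <= a * t:
        return a
    if x >= b * t:
        return b
    return x / t

def residual(t, x, a, b, h=1e-5):
    """Centered-difference estimate of u_t + u u_x at (t, x)."""
    u = rarefaction(t, x, a, b)
    u_t = (rarefaction(t + h, x, a, b) - rarefaction(t - h, x, a, b)) / (2 * h)
    u_x = (rarefaction(t, x + h, a, b) - rarefaction(t, x - h, a, b)) / (2 * h)
    return u_t + u * u_x

a, b, t = 1.0, 3.0, 2.0
print(residual(t, 4.0, a, b))                                    # inside the fan at < x < bt
print(rarefaction(t, a * t, a, b), rarefaction(t, b * t, a, b))  # values at the two edges
```

The residual is only meaningful away from the two corner lines x = at and x = bt, where the profile is continuous but not differentiable.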


These  prototypical  solutions  epitomize  the  basic  phenomena  modeled  by  the  nonlinear 
transport  equation:  rarefaction  waves ,  which  emanate  from  regions  where  the  initial  data 
satisfies f′(x) > 0, causing the solution to spread out as time progresses, and compression
waves, emanating from regions where f′(x) < 0, causing the solution to progressively steepen
and  eventually  break  into  a  shock  discontinuity.  Anyone  caught  in  a  traffic  jam  recognizes 
the  compression  waves,  where  the  vehicles  are  bunched  together  and  almost  stationary, 
while  the  interspersed  rarefaction  waves  correspond  to  freely  moving  traffic.  (An  intelligent 
driver  will  take  advantage  of  the  rarefaction  waves  moving  backwards  through  the  jam 
to  switch  lanes!)  The  familiar,  frustrating  traffic  jam  phenomenon,  even  on  accident-  or 
construction-free  stretches  of  highway,  is,  thus,  an  intrinsic  effect  of  the  nonlinear  transport 
models  that  govern  traffic  flow,  [122]. 


Example 2.12. Triangular wave: Suppose the initial data has the triangular profile

$$ u(0, x) = f(x) = \begin{cases} x, & 0 \le x \le 1, \\ 0, & \text{otherwise}, \end{cases} $$


as in the first graph in Figure 2.22. The initial discontinuity at x = 1 will propagate as a
shock wave, while the slanted line behaves as a rarefaction wave. To find the profile at time
t, we first graph the multi-valued solution obtained by moving each point on the graph of
f to the right an amount equal to t times its height. As noted above, this motion preserves
straight lines. Thus, points on the x-axis remain fixed, and the diagonal line now goes
from (0, 0) to (1 + t, 1), which is where the uppermost point (1, 1) on the graph of f has
moved to, and hence has slope $(1 + t)^{-1}$, while the initial vertical shock line has become
tilted, going from (1, 0) to (1 + t, 1). We now need to find the position σ(t) of the shock
line in order to satisfy the Equal Area Rule, namely so that the areas of the two shaded
regions in Figure 2.21 are identical. The reader is invited to determine this geometrically;
instead, we invoke the Rankine-Hugoniot condition (2.53). At the shock line, x = σ(t),
the left- and right-hand limiting values are, respectively,


$$ u^-(t) = u(t, \sigma(t)^-) = \frac{\sigma(t)}{1 + t}, \qquad u^+(t) = u(t, \sigma(t)^+) = 0, $$

and hence (2.53) prescribes the shock speed to be

$$ \frac{d\sigma}{dt} = \frac12 \left( \frac{\sigma(t)}{1 + t} + 0 \right) = \frac{\sigma(t)}{2\,(1 + t)}. $$

The solution to the resulting separable ordinary differential equation is easily found. Since
the shock starts out at σ(0) = 1, we deduce that

$$ \sigma(t) = \sqrt{1 + t}, \qquad \text{with} \qquad \frac{d\sigma}{dt} = \frac{1}{2\sqrt{1 + t}}. $$

Further, the strength of the shock, namely its height, is

$$ u^-(t) = \frac{\sigma(t)}{1 + t} = \frac{1}{\sqrt{1 + t}}. $$

We conclude that, as t increases, the solution remains a triangular wave, of steadily decreasing
slope, while the shock moves off to x = +∞ at a progressively slower speed and smaller
height. Its position follows a parabolic trajectory in the (t, x)-plane. See Figure 2.22 for
representative plots of the triangular wave solution, while Figure 2.23 illustrates the
characteristic lines and shock wave trajectory.

Figure 2.22. Triangular-wave solution.

Figure 2.23. Characteristic lines for the triangular-wave shock.
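The shock trajectory of Example 2.12 can be cross-checked by integrating the Rankine-Hugoniot equation numerically. The sketch below uses a classical Runge-Kutta scheme (an illustrative choice of integrator, with hypothetical function names) and also confirms that the area of the triangular profile stays at its initial value of one half, as the Equal Area Rule requires.

```python
import math

def shock_rhs(t, sigma):
    """Rankine-Hugoniot speed for the triangular wave: average of sigma/(1+t) and 0."""
    return sigma / (2.0 * (1.0 + t))

def integrate_shock(t_end, steps=1000):
    """Classical fourth-order Runge-Kutta from sigma(0) = 1 up to t_end."""
    h = t_end / steps
    t, sigma = 0.0, 1.0
    for _ in range(steps):
        k1 = shock_rhs(t, sigma)
        k2 = shock_rhs(t + h / 2, sigma + h * k1 / 2)
        k3 = shock_rhs(t + h / 2, sigma + h * k2 / 2)
        k4 = shock_rhs(t + h, sigma + h * k3)
        sigma += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return sigma

t_end = 3.0
sigma = integrate_shock(t_end)
height = sigma / (1.0 + t_end)      # u^-(t), the shock strength
area = 0.5 * sigma * height         # mass: area of the triangle under the profile
print(sigma, math.sqrt(1.0 + t_end), area)
```

The computed position should match the closed form, the square root of 1 + t, to many digits.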


In more general situations, continuing on after the initial shock formation, other
characteristic lines may start to cross, thereby producing new shocks. The shocks themselves
continue to propagate, often at different velocities. When a fast-moving shock catches up
with a slow-moving shock, one must then decide how to merge the shocks so as to retain a
physically meaningful solution. The Rankine-Hugoniot (Equal Area) and Entropy Conditions
continue to uniquely specify the dynamics. However, at this point, the mathematical
details have become too intricate for us to pursue any further, and we refer the interested
reader to Whitham’s book, [122]. See also [57] for a proof of the following existence
theorem for shock-wave solutions to the nonlinear transport equation.


Theorem 2.13. If the initial data u(0, x) = f(x) is piecewise† $C^1$ with finitely many
jump discontinuities, then, for t > 0, there exists a unique (weak) solution to the nonlinear
transport equation (2.31) that also satisfies the Rankine-Hugoniot condition (2.53) and
the entropy condition (2.55).


Remark :  Our  derivation  of  the  Rankine-Hugoniot  shock  speed  condition  (2.53)  relied 
on  the  fact  that  we  can  write  the  original  partial  differential  equation  in  the  form  of  a 
conservation  law.  But  there  are,  in  fact,  other  ways  to  do  this.  For  instance,  multiplying  the 
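nonlinear transport equation (2.31) by u allows us to write it in the alternative conservative
form

$$ u \, \frac{\partial u}{\partial t} + u^2 \, \frac{\partial u}{\partial x} = \frac{\partial}{\partial t}\left( \tfrac12 u^2 \right) + \frac{\partial}{\partial x}\left( \tfrac13 u^3 \right) = 0. \tag{2.56} $$

In this formulation, the conserved density is $T = \tfrac12 u^2$, and the associated flux is $X = \tfrac13 u^3$.
The integrated form (2.49) of the conservation law (2.56) is

$$ \frac{d}{dt} \int_a^b \tfrac12 \, u(t, x)^2 \, dx = - \tfrac13 u^3 \,\Big|_{x=a}^{b} = \tfrac13 \left[ u(t, a)^3 - u(t, b)^3 \right]. \tag{2.57} $$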


In some physical models, the integral on the left-hand side represents the energy within the
interval [a, b], and the conservation law tells us that energy can enter the interval as a flux
only through its ends. If we assume that energy is conserved at a shock, then, repeating
our previous argument, we are led to the alternative equation

$$ \frac{d\sigma}{dt} = \frac23 \, \frac{u^-(t)^2 + u^-(t)\,u^+(t) + u^+(t)^2}{u^-(t) + u^+(t)} \tag{2.58} $$


for  the  shock  speed.  Thus,  a  shock  that  conserves  energy  moves  at  a  different  speed  from 
one  that  conserves  mass!  The  evolution  of  a  shock  wave  depends  not  just  on  the  underlying 
differential  equation,  but  also  on  the  physical  assumptions  governing  the  selection  of  a 
suitable  conservation  law. 
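To make the difference concrete, the following sketch evaluates both shock-speed formulas for one sample jump. The function names and the sample states u⁻ = 2, u⁺ = 1 are illustrative assumptions.

```python
def mass_shock_speed(u_minus, u_plus):
    """Speed (2.53) from conservation of mass: T = u, X = u^2 / 2."""
    return 0.5 * (u_minus + u_plus)

def energy_shock_speed(u_minus, u_plus):
    """Speed (2.58) from conservation of energy: T = u^2 / 2, X = u^3 / 3."""
    return (2.0 / 3.0) * (u_minus**2 + u_minus * u_plus + u_plus**2) / (u_minus + u_plus)

u_minus, u_plus = 2.0, 1.0
print(mass_shock_speed(u_minus, u_plus))     # 1.5
print(energy_shock_speed(u_minus, u_plus))   # 14/9, about 1.556
```

Subtracting the two formulas gives a gap of (u⁻ − u⁺)² / (6 (u⁻ + u⁺)), which is 1/18 for these states and vanishes only when there is no jump.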


More  General  Wave  Speeds 

Let  us  finish  this  section  by  considering  a  nonlinear  transport  equation 

$$ u_t + c(u) \, u_x = 0, \tag{2.59} $$

whose wave speed is a more general function of the disturbance u. (Further extensions,
allowing c to depend also on t and x, are discussed in Exercise 2.3.20.) Most of the
development is directly parallel to the special case (2.31) discussed above, and so the
details are left for the reader to fill in, although the shock dynamics does require some
care.

† Meaning continuous everywhere, and continuously differentiable except at a discrete set of
points; see Definition 3.7 below for the precise definition.

In this case, the characteristic curve equation is

$$ \frac{dx}{dt} = c(u(t, x)). \tag{2.60} $$

As before, the solution u is constant on characteristics, and hence the characteristics are
straight lines, now with slope c(u). Thus, to solve the initial value problem

$$ u(0, x) = f(x), \tag{2.61} $$

through each point (0, y) on the x-axis, one draws the characteristic line of slope c(u(0, y)) =
c(f(y)). Until the onset of a shock discontinuity, the solution maintains its initial value
u(0, y) = f(y) along the characteristic line.
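The characteristic construction translates directly into a few lines of code. The sketch below is only an illustration: the wave speed c(u) = u², the step-like initial profile, and the function names are hypothetical choices. It tracks the straight characteristic through (0, y) and computes when two such lines meet, using the elementary fact that lines x = y₁ + s₁t and x = y₂ + s₂t with y₁ < y₂ intersect at t = (y₂ − y₁)/(s₁ − s₂) when s₁ > s₂.

```python
def characteristic_x(t, y, c, f):
    """Position at time t of the characteristic line through (0, y): x = y + c(f(y)) t."""
    return y + c(f(y)) * t

def crossing_time(y1, y2, c, f):
    """Time at which the characteristics from y1 < y2 meet, or None if they never do."""
    s1, s2 = c(f(y1)), c(f(y2))
    if s1 <= s2:            # the left line is no faster, so the two lines do not collide
        return None
    return (y2 - y1) / (s1 - s2)

# Hypothetical illustration: wave speed c(u) = u^2 and a decreasing initial profile.
c = lambda u: u * u
f = lambda y: 2.0 if y < 0 else 1.0

t_star = crossing_time(-1.0, 1.0, c, f)
x_left = characteristic_x(t_star, -1.0, c, f)
x_right = characteristic_x(t_star, 1.0, c, f)
print(t_star, x_left, x_right)
```

At the crossing time both characteristics arrive at the same position while carrying different values of u, which is exactly the situation in which a shock must be introduced.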

A  shock  forms  whenever  two  characteristic  lines  cross.  As  before,  the  mathematical 
equation  no  longer  uniquely  specifies  the  subsequent  dynamics,  and  we  need  to  appeal  to 
an  appropriate  conservation  law.  We  write  the  transport  equation  in  the  form 

$$ \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} \, C(u) = 0, \qquad \text{where} \qquad C(u) = \int c(u) \, du \tag{2.62} $$

is any convenient anti-derivative of the wave speed. Thus, following the same computation
as in (2.44), we discover that conservation of mass now takes the integrated form

$$ \frac{d}{dt} \int_a^b u(t, x) \, dx = C(u(t, a)) - C(u(t, b)), \tag{2.63} $$


with C(u) playing the role of the mass flux. Requiring the conservation of mass, i.e., of
the area under the graph of the solution, means that the Equal Area Rule remains valid.
However, the Rankine-Hugoniot shock-speed condition must be modified in accordance
with the new dynamics. Mimicking the preceding argument, but with the modified mass
flux, we find that the shock speed is now given by

$$ \frac{d\sigma}{dt} = \frac{C(u^-(t)) - C(u^+(t))}{u^-(t) - u^+(t)}. \tag{2.64} $$

Note that if

$$ c(u) = u, \qquad \text{then} \qquad C(u) = \int u \, du = \tfrac12 u^2, $$

and so (2.64) reduces to our earlier formula (2.53). Moreover, in the limit as the shock
magnitude approaches zero, $u^-(t) - u^+(t) \to 0$, the right-hand side of (2.64) converges to
the derivative C′(u) = c(u) and hence recovers the wave speed, as it should.
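Both closing observations can be checked with a short computation. In the sketch below, the function name `shock_speed` and the second flux, C(u) = u³/3 corresponding to c(u) = u², are illustrative assumptions.

```python
def shock_speed(C, u_minus, u_plus):
    """Rankine-Hugoniot speed (2.64): difference quotient of the flux C across the jump."""
    return (C(u_minus) - C(u_plus)) / (u_minus - u_plus)

# c(u) = u gives C(u) = u^2 / 2 and should reproduce the average (2.53).
quadratic_flux = lambda u: 0.5 * u * u
print(shock_speed(quadratic_flux, 3.0, 1.0))        # 2.0

# Illustrative choice c(u) = u^2, so C(u) = u^3 / 3.
cubic_flux = lambda u: u ** 3 / 3.0
print(shock_speed(cubic_flux, 2.0, 1.0))            # (4 + 2 + 1) / 3
print(shock_speed(cubic_flux, 1.0 + 1e-6, 1.0))     # close to c(1) = 1 for a weak shock
```

The last line illustrates the weak-shock limit: as the jump shrinks, the difference quotient approaches the derivative of C, which is the wave speed.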


Exercises 


2.3.1. Discuss the behavior of the solution to the nonlinear transport equation (2.31) for the
following initial data:

(a) $u(0, x) = \begin{cases} 2, & x < -1, \\ 1, & x > -1; \end{cases}$   (b) $u(0, x) = \begin{cases} -2, & x < -1, \\ 1, & x > -1; \end{cases}$   (c) $u(0, x) = \begin{cases} 1, & x < 1, \\ -2, & x > 1. \end{cases}$

2.3.2. Solve the following initial value problems:


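(a) $u_t + 3\,u\,u_x = 0$, $u(0, x) = \begin{cases} 2, & x < 1, \\ 0, & x > 1; \end{cases}$

(b) $u_t - u\,u_x = 0$, $u(1, x) = \begin{cases} -1, & x < 0, \\ 3, & x > 0; \end{cases}$

(c) $u_t - 2\,u\,u_x = 0$, $u(0, x) = \begin{cases} 1, & x < 1, \\ 0, & x > 1. \end{cases}$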


2.3.3. Let $u(0, x) = (x^2 + 1)^{-1}$. Does the resulting solution to the nonlinear transport equation
(2.31) produce a shock wave? If so, find the time of onset of the shock, and sketch a graph
of the solution just before and soon after the shock wave. If not, explain what happens to
the solution as t increases.

2.3.4. Solve Exercise 2.3.3 when u(0, x) = (a) $-(x^2 + 1)^{-1}$, (b) $x\,(x^2 + 1)^{-1}$.


2.3.5.  Consider  the  initial  value  problem  ut  —  2 uux  =  0,  a(0,  x)  =  e~  .  Does  the  resulting 
solution  produce  a  shock  wave?  If  so,  find  the  time  of  onset  of  the  shock  and  the  position 
at  which  it  first  forms.  If  not,  explain  what  happens  to  the  solution  as  t  increases. 

2.3.6.  (a)  For  what  values  of  a,  /?,  7,  S  is  u(t,  x)  =  ^  ^  a  solution  to  (2.31)? 

(b)  For  what  values  of  a,  /?,  7,  S ,  A,  /i  is  u(t,  x)  =  ^  ax  @  a  solution  to  (2.31)? 

7*  t  |  1 /  x  j  0 

2.3.7.  A  triangular  wave  is  a  shock- wave  solution  to  the  initial  value  problem  for  (2.31)  that 


has  initial  data  u{ 0,  x)  = 


mx,  0  <  x  < 


Assuming  m  >  0,  write  down  a  formula  for 


0,  otherwise. 

the  triangular- wave  solution  at  times  t  >  0.  Discuss  what  happens  to  the  triangular  wave  as 
time  progresses. 


2.3.8.  Solve  Exercise  2.3.7  when  m  <  0. 


2.3.9.  Solve  (2.31)  for  t  >  0  subject  to  the  following  initial  conditions,  and  graph  your  solution 
at  some  representative  times.  In  what  sense  does  your  solution  conserve  mass? 

1,  0  x  <C  1,  /i\  /q  \  f  1  <C  x  <C  1, 

0,  otherwise,  l  u\  5 x)  1  otherwise, 


(a)  a(0,  x) 


(c)  u{ 0,  x)  = 


0. 


x,  —  1  <  x  <  1, 
otherwise, 


(d)  u{ 0,  x)  = 


1 

0. 


x 


—  1  <  X  <  1, 

otherwise. 


2.3.10.  An  N-wave  is  a  solution  to  the  nonlinear  transport  equation  (2.31)  that  has  initial  con¬ 
ditions  u(0,x)  =  {  £  _  x  wpere  772  >  0.  (a)  Write  down  a  formula  for  the 

(0,  otherwise, 

TV-wave  solution  at  times  t  >  0.  (b)  What  about  when  m  <  0? 

^ 2.3.11. Suppose $u(t, x)$ and $\tilde u(t, x)$ are two solutions to the nonlinear transport equation (2.31)
such that, for some $t_\star > 0$, they agree: $u(t_\star, x) = \tilde u(t_\star, x)$ for all x. Do the solutions
necessarily have the same initial conditions: $u(0, x) = \tilde u(0, x)$? Use your answer to discuss the
uniqueness of solutions to the nonlinear transport equation.

2.3.12. Suppose that $x_1 < x_2$ are such that the characteristic lines of (2.31) through $(0, x_1)$
and $(0, x_2)$ cross at a shock at (t, σ(t)) and, moreover, the left- and right-hand shock values
(2.52) are $f(x_1) = u^-(t)$, $f(x_2) = u^+(t)$. Explain why the signed area of the region between
the graph of f(x) and the secant line connecting $(x_1, f(x_1))$ to $(x_2, f(x_2))$ is zero.

^ 2.3.13. Consider the initial value problem $u_\varepsilon(0, x) = 2 + \tan^{-1}(x/\varepsilon)$ for the nonlinear
transport equation (2.31). (a) Show that, as $\varepsilon \to 0^+$, the initial condition converges to a step
function (2.51). What are the values of a, b? (b) Show that, moreover, the resulting solution
$u_\varepsilon(t, x)$ to the nonlinear transport equation converges to the corresponding rarefaction
wave (2.54) resulting from the limiting initial condition.

0 2.3.14. (a) Under what conditions can equation (2.35) be solved for a single-valued function
u(t, x)? Hint: Use the Implicit Function Theorem. (b) Use implicit differentiation to prove
that the resulting function u(t, x) is a solution to the nonlinear transport equation.

2.3.15. For what values of α, β, γ, δ, k is $u(t, x) = \left( \dfrac{\alpha x + \beta}{\gamma t + \delta} \right)^{k}$ a solution to the transport
equation $u_t + u^2 u_x = 0$?

2.3.16. (a) Solve the initial value problem $u_t + u^2 u_x = 0$, u(0, x) = f(x), by the method of
characteristics. (b) Discuss the behavior of solutions and compare/contrast with (2.31).

2.3.17. (a) Determine the Rankine-Hugoniot condition, based on conservation of mass, for the
speed of a shock for the equation $u_t + u^2 u_x = 0$. (b) Solve the initial value problem

$$ u(0, x) = \begin{cases} a, & x < 0, \\ b, & x > 0, \end{cases} $$

when (i) |a| > |b|, (ii) |a| < |b|. Hint: Use Exercise 2.3.15
to determine the shape of a rarefaction wave.

2.3.18. Solve Exercise 2.3.17 when the wave speed c(u) = (i) 1 − 2u, (ii) $u^3$, (iii) sin u.

0 2.3.19. Justify the shock-speed formula (2.58).


^ 2.3.20. Consider the general quasilinear first-order partial differential equation

$$ \frac{\partial u}{\partial t} + c(t, x, u) \, \frac{\partial u}{\partial x} = h(t, x, u). $$

Let us define a lifted characteristic curve to be a solution (t, x(t), u(t)) to the system of
ordinary differential equations $\dfrac{dx}{dt} = c(t, x, u)$, $\dfrac{du}{dt} = h(t, x, u)$. The corresponding
characteristic curve (t, x(t)) is obtained by projecting to the (t, x)-plane. Prove that if u(t, x) is a
solution to the partial differential equation, and $u(t_0, x_0) = u_0$, then the lifted characteristic
curve passing through $(t_0, x_0, u_0)$ lies on the graph of u(t, x). Conclude that the graph of
the solution to the initial value problem $u(t_0, x) = f(x)$ is the union of all lifted characteristic
curves passing through the initial data points $(t_0, x_0, f(x_0))$.

2.3.21.  Let  a  >  0.  (a)  Apply  the  method  of  Exercise  2.3.20  to  solve  the  initial  value  problem 
for  the  damped  transport  equation :  ut  +  uux  +  au  =  0,  u(0,  x)  =  f(x). 

(b)  Does  the  damping  eliminate  shocks? 

2.3.22.  Apply  the  method  of  Exercise  2.3.20  to  solve  the  initial  value  problem 

2  1 


ut  +  tu 


X 


u 


u( 0,  x) 


1  +  x‘ 


2.4  The  Wave  Equation:  d’Alembert’s  Formula 


Newton’s  Second  Law  states  that  force  equals  mass  times  acceleration.  It  forms  the  bedrock 
underlying  the  derivation  of  mathematical  models  describing  all  of  classical  dynamics. 
When  applied  to  a  one-dimensional  medium,  such  as  the  transverse  displacements  of  a 
violin  string  or  the  longitudinal  motions  of  an  elastic  bar,  the  resulting  model  governing 
small vibrations is the second-order partial differential equation

$$ \rho(x) \, \frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x} \left( \kappa(x) \, \frac{\partial u}{\partial x} \right). \tag{2.65} $$

Here u(t, x) represents the displacement of the string or bar at time t and position x,
while ρ(x) > 0 denotes its density and κ(x) > 0 its stiffness or tension, both of which are
assumed not to vary with t. The right-hand side of the equation represents the restoring
force  due  to  a  (small)  displacement  of  the  medium  from  its  equilibrium,  whereas  the  left- 
hand  side  is  the  product  of  mass  per  unit  length  and  acceleration.  A  correct  derivation  of 
the  model  from  first  principles  would  require  a  significant  detour,  and  we  refer  the  reader 
to  [120,  124]  for  the  details. 

We will simplify the general model by assuming that the underlying medium is uniform,
and so both its density ρ and stiffness κ are constant. Then (2.65) reduces to the
one-dimensional wave equation

$$ \frac{\partial^2 u}{\partial t^2} = c^2 \, \frac{\partial^2 u}{\partial x^2}, \qquad \text{where the constant} \qquad c = \sqrt{\frac{\kappa}{\rho}} > 0 \tag{2.66} $$

is known as the wave speed, for reasons that will soon become apparent.

In general, to uniquely specify the solution to any dynamical system arising from
Newton’s Second Law, including the wave equation (2.66) and the more general vibration
equation (2.65), one must fix both its initial position and initial velocity. Thus, the initial
conditions take the form

$$ u(0, x) = f(x), \qquad \frac{\partial u}{\partial t}(0, x) = g(x), \tag{2.67} $$

where, for simplicity, we set the initial time $t_0 = 0$. (See also Exercise 2.4.6.) The initial
value problem seeks the corresponding $C^2$ function u(t, x) that solves the wave equation
(2.66) and has the required initial values (2.67). In this section, we will learn how to
solve the initial value problem on the entire line −∞ < x < ∞. The analysis of the
wave equation on bounded intervals will be deferred until Chapters 4 and 7. The two-
and three-dimensional versions of the wave equation are treated in Chapters 11 and 12,
respectively.

d’Alembert’s Solution

Let us now derive the explicit solution formula for the second-order wave equation (2.66)
first found by d’Alembert. The starting point is to write the partial differential equation
in the suggestive form

$$ \Box u = (\partial_t^2 - c^2 \, \partial_x^2) \, u = u_{tt} - c^2 u_{xx} = 0. \tag{2.68} $$

Here

$$ \Box = \partial_t^2 - c^2 \, \partial_x^2 $$

is a common mathematical notation for the wave operator, which is a linear second-order
partial differential operator. In analogy with the elementary polynomial factorization

$$ t^2 - c^2 x^2 = (t - c x)(t + c x), $$

we can factor the wave operator into a product of two first-order partial differential
operators:†

$$ \Box = \partial_t^2 - c^2 \, \partial_x^2 = (\partial_t - c \, \partial_x)(\partial_t + c \, \partial_x). \tag{2.69} $$

† The cross terms cancel, thanks to the equality of mixed partial derivatives: $\partial_t \partial_x u = \partial_x \partial_t u$.
Constancy of the wave speed c is essential here.
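For readers who want to see the cancellation explicitly, the factorization can be expanded term by term, assuming u is twice continuously differentiable and c is constant:

```latex
\[
(\partial_t - c\,\partial_x)(\partial_t + c\,\partial_x)\,u
  = \partial_t (u_t + c\,u_x) - c\,\partial_x (u_t + c\,u_x)
  = u_{tt} + c\,u_{xt} - c\,u_{tx} - c^2 u_{xx}
  = u_{tt} - c^2 u_{xx}.
\]
```

If c depended on t or x, differentiating the product c u_x would produce extra terms and the two middle terms would no longer be the only cross terms.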


Now, if the second factor annihilates the function u(t, x), meaning

$$ (\partial_t + c \, \partial_x) \, u = u_t + c \, u_x = 0, \tag{2.70} $$

then u is automatically a solution to the wave equation, since

$$ \Box u = (\partial_t - c \, \partial_x)(\partial_t + c \, \partial_x) \, u = (\partial_t - c \, \partial_x) \, 0 = 0. $$

We recognize (2.70) as the first-order transport equation (2.4) with constant wave speed c.
Proposition 2.1 tells us that its solutions are traveling waves with wave speed c:

$$ u(t, x) = p(\xi) = p(x - ct), \tag{2.71} $$

where p is an arbitrary function of the characteristic variable ξ = x − ct. As long as
$p \in C^2$ (i.e., is twice continuously differentiable), the resulting function u(t, x) is a classical
solution to the wave equation (2.66), as you can easily check.

Now, the factorization (2.69) can equally well be written in the reverse order:

$$ \Box = \partial_t^2 - c^2 \, \partial_x^2 = (\partial_t + c \, \partial_x)(\partial_t - c \, \partial_x). \tag{2.72} $$

The same argument tells us that any solution to the “backwards” transport equation

$$ u_t - c \, u_x = 0, \tag{2.73} $$

with constant wave speed −c, also provides a solution to the wave equation. Again, by
Proposition 2.1, with c replaced by −c, the general solution to (2.73) has the form

$$ u(t, x) = q(\eta) = q(x + ct), \tag{2.74} $$

where q is an arbitrary function of the alternative characteristic variable η = x + ct. The
solutions (2.74) represent traveling waves moving to the left with constant speed c > 0.
Provided $q \in C^2$, the functions (2.74) will provide a second family of solutions to the wave
equation.
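Both families of traveling waves can be tested numerically. The sketch below is a rough finite-difference check, not a proof: the profiles p and q, the wave speed c = 2, and the sample point are arbitrary illustrative choices. It estimates u_tt − c² u_xx by centered second differences.

```python
import math

C_SPEED = 2.0   # wave speed c (illustrative value)

def wave_residual(u, t, x, h=1e-3):
    """Centered second differences for u_tt - c^2 u_xx at the point (t, x)."""
    u_tt = (u(t + h, x) - 2 * u(t, x) + u(t - h, x)) / h**2
    u_xx = (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / h**2
    return u_tt - C_SPEED**2 * u_xx

p = lambda xi: math.exp(-xi * xi)      # right-moving profile p(x - ct)
q = lambda eta: math.sin(eta)          # left-moving profile q(x + ct)

right = lambda t, x: p(x - C_SPEED * t)
left = lambda t, x: q(x + C_SPEED * t)

print(wave_residual(right, 0.3, 0.5), wave_residual(left, 0.3, 0.5))
```

Both residuals should be small, on the order of the squared step size, for any smooth choice of profile.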

We  conclude  that,  unlike  first-order  transport  equations,  the  wave  equation  (2.68) 
is  bidirectional  in  that  it  admits  both  left  and  right  traveling-wave  solutions.  Moreover, 
by  linearity  the  sum  of  any  two  solutions  is  again  a  solution,  and  so  we  can  immediately 
construct  solutions  that  are  superpositions  of  left  and  right  traveling  waves.  The  remarkable 
fact  is  that  every  solution  to  the  wave  equation  can  be  so  represented. 

Theorem 2.14. Every solution to the wave equation (2.66) can be written as a
superposition,

$$ u(t,x) = p(\xi) + q(\eta) = p(x - ct) + q(x + ct), \tag{2.75} $$

of right and left traveling waves. Here $p(\xi)$ and $q(\eta)$ are arbitrary $C^2$ functions, each
depending on its respective characteristic variable

$$ \xi = x - ct, \qquad \eta = x + ct. \tag{2.76} $$

Proof: As in our treatment of the transport equation, we will simplify the wave equation
through an inspired change of variables. In this case, the new independent variables
are the characteristic variables $\xi, \eta$ defined by (2.76). We set

$$ u(t,x) = v(x - ct,\, x + ct) = v(\xi,\eta), \qquad \text{whereby} \qquad v(\xi,\eta) = u\!\left( \frac{\eta - \xi}{2c},\, \frac{\eta + \xi}{2} \right). \tag{2.77} $$

Then, employing the chain rule to compute the partial derivatives,

$$ \frac{\partial u}{\partial t} = c \left( -\,\frac{\partial v}{\partial \xi} + \frac{\partial v}{\partial \eta} \right), \qquad \frac{\partial u}{\partial x} = \frac{\partial v}{\partial \xi} + \frac{\partial v}{\partial \eta}, \tag{2.78} $$


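and, further,

$$ \frac{\partial^2 u}{\partial t^2} = c^2 \left( \frac{\partial^2 v}{\partial \xi^2} - 2\,\frac{\partial^2 v}{\partial \xi\,\partial \eta} + \frac{\partial^2 v}{\partial \eta^2} \right), \qquad \frac{\partial^2 u}{\partial x^2} = \frac{\partial^2 v}{\partial \xi^2} + 2\,\frac{\partial^2 v}{\partial \xi\,\partial \eta} + \frac{\partial^2 v}{\partial \eta^2}. $$

Therefore

$$ \Box u = \frac{\partial^2 u}{\partial t^2} - c^2\,\frac{\partial^2 u}{\partial x^2} = -\,4c^2\,\frac{\partial^2 v}{\partial \xi\,\partial \eta}. \tag{2.79} $$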


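We conclude that $u(t,x)$ solves the wave equation $\Box u = 0$ if and only if $v(\xi,\eta)$ solves the
second-order partial differential equation

$$ \frac{\partial^2 v}{\partial \xi\,\partial \eta} = 0, $$

which we write in the form

$$ \frac{\partial}{\partial \xi} \left( \frac{\partial v}{\partial \eta} \right) = \frac{\partial w}{\partial \xi} = 0, \qquad \text{where} \qquad w = \frac{\partial v}{\partial \eta}. $$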

Thus, applying the methods of Section 2.1 (and making the appropriate assumptions on
the domain of definition of $w$), we deduce that

$$ w = \frac{\partial v}{\partial \eta} = r(\eta), $$

where $r$ is an arbitrary function of the characteristic variable $\eta$. Integrating both sides of
the latter partial differential equation with respect to $\eta$, we find

$$ v(\xi,\eta) = p(\xi) + q(\eta), \qquad \text{where} \qquad q(\eta) = \int r(\eta)\, d\eta, $$

while $p(\xi)$ represents the $\eta$ integration “constant”. Replacing the characteristic variables
by their formulas in terms of $t$ and $x$ completes the proof. Q.E.D.
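The conclusion of Theorem 2.14 can be spot-checked numerically. The following is only a sketch: the profiles `p` and `q`, the wave speed `c = 2`, the sample points, and the helper `wave_residual` are illustrative choices, not part of the text. It estimates $u_{tt} - c^2 u_{xx}$ by central differences for a superposition of the form (2.75), and, for contrast, for a profile moving at the wrong speed.

```python
import math

def wave_residual(u, t, x, c, h=1e-3):
    """Central-difference estimate of u_tt - c^2 u_xx at the point (t, x)."""
    u_tt = (u(t + h, x) - 2.0 * u(t, x) + u(t - h, x)) / h**2
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / h**2
    return u_tt - c**2 * u_xx

c = 2.0  # illustrative wave speed (an assumption)

def p(xi):
    """Right-moving profile; an arbitrary smooth choice."""
    return math.exp(-xi**2)

def q(eta):
    """Left-moving profile; an arbitrary smooth choice."""
    return math.sin(eta)

def u(t, x):
    """Superposition (2.75) of a right and a left traveling wave."""
    return p(x - c * t) + q(x + c * t)

def not_a_solution(t, x):
    """The same profile p moving at speed c/2; it should fail the check."""
    return p(x - 0.5 * c * t)

sample_points = [(t, x) for t in (0.0, 0.5, 1.3) for x in (-1.0, 0.2, 2.5)]
max_residual = max(abs(wave_residual(u, t, x, c)) for t, x in sample_points)
wrong_residual = max(abs(wave_residual(not_a_solution, t, x, c)) for t, x in sample_points)
print(max_residual, wrong_residual)
```

The first residual should be at the level of the finite-difference error, while the second is of order one, which illustrates that the profile must travel at exactly speed $c$.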

Let  us  see  how  the  solution  formula  (2.75)  can  be  used  to  solve  the  initial  value  problem 
(2.67).  Substituting  into  the  initial  conditions,  we  deduce  that 


$$ u(0,x) = p(x) + q(x) = f(x), \qquad \frac{\partial u}{\partial t}(0,x) = -\,c\,p'(x) + c\,q'(x) = g(x). \tag{2.80} $$


To solve this pair of equations for the functions $p$ and $q$, we differentiate the first,

$$ p'(x) + q'(x) = f'(x), $$

and then subtract off the second equation divided by $c$; the result is

$$ 2\,p'(x) = f'(x) - \frac{1}{c}\, g(x). $$

Therefore,

$$ p(x) = \frac{1}{2}\, f(x) - \frac{1}{2c} \int_0^x g(z)\, dz + a, $$

where $a$ is an integration constant. The first equation in (2.80) then yields

$$ q(x) = f(x) - p(x) = \frac{1}{2}\, f(x) + \frac{1}{2c} \int_0^x g(z)\, dz - a. $$

Substituting these two expressions back into our solution formula (2.75), we obtain

$$ u(t,x) = p(\xi) + q(\eta) = \frac{f(\xi) + f(\eta)}{2} - \frac{1}{2c} \int_0^{\xi} g(z)\, dz + \frac{1}{2c} \int_0^{\eta} g(z)\, dz = \frac{f(\xi) + f(\eta)}{2} + \frac{1}{2c} \int_{\xi}^{\eta} g(z)\, dz, $$

where $\xi, \eta$ are the characteristic variables (2.76). In this manner, we have arrived at
d'Alembert’s solution to the initial value problem for the wave equation on the real line.

Theorem  2.15.  The  solution  to  the  initial  value  problem 


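$$ \frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2}, \qquad u(0,x) = f(x), \qquad \frac{\partial u}{\partial t}(0,x) = g(x), \qquad -\infty < x < \infty, \tag{2.81} $$

is given by

$$ u(t,x) = \frac{f(x - ct) + f(x + ct)}{2} + \frac{1}{2c} \int_{x - ct}^{x + ct} g(z)\, dz. \tag{2.82} $$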


Remark: In order that (2.82) define a classical solution to the wave equation, we
need $f \in C^2$ and $g \in C^1$. However, the formula itself makes sense for more general
initial  conditions.  We  will  continue  to  treat  the  resulting  functions  as  solutions,  albeit 
nonclassical,  since  they  fit  under  the  more  general  rubric  of  “weak  solution” ,  to  be  developed 
in  Section  10.4. 
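For readers who wish to experiment, d'Alembert's formula (2.82) is straightforward to evaluate numerically. The sketch below is one possible approach, not a prescribed method: the names `simpson` and `dalembert` are hypothetical, and the velocity integral is approximated by a composite Simpson rule, which is our own choice of quadrature.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def dalembert(f, g, c, t, x):
    """Evaluate formula (2.82) at (t, x) for initial data f, g and wave speed c."""
    displacement_part = 0.5 * (f(x - c * t) + f(x + c * t))
    velocity_part = simpson(g, x - c * t, x + c * t) / (2.0 * c)
    return displacement_part + velocity_part

def zero(x):
    return 0.0

# Check against a case that can be integrated by hand (an illustrative choice):
# f = 0, g = cos x gives u(t, x) = cos(x) sin(ct) / c.
c, t, x = 3.0, 0.7, 0.4
numerical = dalembert(zero, math.cos, c, t, x)
exact = math.cos(x) * math.sin(c * t) / c
print(numerical, exact)
```

At $t = 0$ the integral term vanishes and the function returns $f(x)$, in agreement with the first initial condition in (2.81).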


Example  2.16.  Suppose  there  is  no  initial  velocity,  so  g(x)  =  0,  and  hence  the 
motion  is  purely  the  result  of  the  initial  displacement  u(0,x)  =  f(x).  In  this  case,  (2.82) 
reduces  to 


$$ u(t,x) = \tfrac{1}{2}\, f(x - ct) + \tfrac{1}{2}\, f(x + ct). \tag{2.83} $$

[Figure 2.25. Interaction of waves, shown at times t = 0, 1, 2, 3, 4, 5.]


The effect is that the initial displacement splits into two waves, one moving to the right
and the other moving to the left, each of constant speed $c$, and each of exactly the same
shape as $f(x)$, but only half as tall. For example, if the initial displacement is a localized
pulse centered at the origin, say

$$ u(0,x) = e^{-x^2}, \qquad \frac{\partial u}{\partial t}(0,x) = 0, $$

then the solution

$$ u(t,x) = \tfrac{1}{2}\, e^{-(x - ct)^2} + \tfrac{1}{2}\, e^{-(x + ct)^2} $$

consists of two half-size pulses running away from the origin with the same speed $c$, but
in opposite directions. A graph of the solution at several successive times can be seen in
Figure 2.24.

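If we take two initially separated pulses, say

$$ u(0,x) = e^{-x^2} + 2\,e^{-(x - 1)^2}, \qquad \frac{\partial u}{\partial t}(0,x) = 0, $$

centered at $x = 0$ and $x = 1$, then the solution

$$ u(t,x) = \tfrac{1}{2}\, e^{-(x - ct)^2} + e^{-(x - 1 - ct)^2} + \tfrac{1}{2}\, e^{-(x + ct)^2} + e^{-(x - 1 + ct)^2} $$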

will  consist  of  four  pulses,  two  moving  to  the  right  and  two  to  the  left,  all  with  the  same 
speed.  An  important  observation  is  that  when  a  right-moving  pulse  collides  with  a  left- 
moving  pulse,  they  emerge  from  the  collision  unchanged,  which  is  a  consequence  of  the 
inherent  linearity  of  the  wave  equation.  In  Figure  2.25,  the  first  picture  plots  the  initial 
displacement.  In  the  second  and  third  pictures,  the  two  localized  bumps  have  each  split  into 
two  copies  moving  in  opposite  directions.  In  the  fourth  and  fifth,  the  larger  right-moving 
bump  is  in  the  process  of  interacting  with  the  smaller  left-moving  bump.  Finally,  in  the 
last  picture  the  interaction  is  complete,  and  the  individual  pairs  of  left-  and  right-moving 
waves  move  off  in  tandem  in  opposing  directions,  experiencing  no  further  collisions. 


[Figure 2.26. The error function erf x.]

In general, if the initial displacement is localized, so that $|f(x)| \ll 1$ for $|x| \gg 0$, then,
after a finite time, the left- and right-moving waves will separate, and the observer will see
two half-size replicas running away, with speed $c$, in opposite directions. If the displacement
is not localized, then the left and right traveling waves will never fully disengage, and one
might be hard pressed to recognize that a complicated solution pattern is, in reality, just
the superposition of two simple traveling waves. For example, consider the elementary
trigonometric solution

$$ \cos ct\, \cos x = \tfrac{1}{2} \cos(x - ct) + \tfrac{1}{2} \cos(x + ct). \tag{2.84} $$

In  accordance  with  the  left-hand  expression,  an  observer  will  see  a  standing  cosinusoidal 
wave  that  vibrates  up  and  down  with  frequency  c.  However,  the  d’Alembert  form  of  the 
solution  on  the  right-hand  side  says  that  this  is  just  the  sum  of  left-  and  right-traveling 
cosine  waves!  The  interactions  of  their  peaks  and  troughs  reproduce  the  standing  wave. 
Thus,  the  same  solution  can  be  interpreted  in  two  seemingly  incompatible  ways.  And, 
in  fact,  this  paradox  lies  at  the  heart  of  the  perplexing  wave-particle  duality  of  quantum 
physics. 


Example  2.17.  By  way  of  contrast,  suppose  there  is  no  initial  displacement,  so 
f(x)  =  0,  and  the  motion  is  purely  the  result  of  the  initial  velocity  ut(0,x)  =  g(x). 
Physically,  this  models  a  violin  string  at  rest  being  struck  by  a  “hammer  blow”  at  the 
initial  time.  In  this  case,  the  d’Alembert  formula  (2.82)  reduces  to 


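$$ u(t,x) = \frac{1}{2c} \int_{x - ct}^{x + ct} g(z)\, dz. \tag{2.85} $$

For example, when $u(0,x) = 0$, $u_t(0,x) = e^{-x^2}$, the resulting solution (2.85) is

$$ u(t,x) = \frac{1}{2c} \int_{x - ct}^{x + ct} e^{-z^2}\, dz = \frac{\sqrt{\pi}}{4c} \Bigl[ \operatorname{erf}(x + ct) - \operatorname{erf}(x - ct) \Bigr], \tag{2.86} $$

where

$$ \operatorname{erf} x = \frac{2}{\sqrt{\pi}} \int_0^x e^{-z^2}\, dz \tag{2.87} $$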


is  known  as  the  error  function  due  to  its  many  applications  throughout  probability  and 
statistics,  [39].  The  error  function  integral  cannot  be  written  in  terms  of  elementary 
functions;  nevertheless,  its  properties  have  been  well  studied  and  its  values  tabulated, 
[86].  A  graph  appears  in  Figure  2.26.  The  constant  in  front  of  the  integral  (2.87)  has  been 
chosen  so  that  the  error  function  has  asymptotic  values 


$$ \lim_{x \to \infty} \operatorname{erf} x = 1, \qquad \lim_{x \to -\infty} \operatorname{erf} x = -1, \tag{2.88} $$

which follow from a well-known integration formula to be derived in Exercise 2.4.21.

A  graph  of  the  solution  (2.86)  at  successive  times  is  displayed  in  Figure  2.27.  The 
first  graph  shows  the  zero  initial  displacement.  Gradually,  the  effect  of  the  initial  hammer 
blow  is  felt  further  and  further  away  along  the  string,  as  the  two  wave  fronts  propagate 
away  from  the  origin,  both  with  speed  c,  but  in  opposite  directions.  Thus,  unlike  the  case 
of  a  nonzero  initial  displacement  in  Figure  2.24,  where  the  solution  eventually  returns  to 
its  equilibrium  position  u  =  0  after  the  wave  passes  by,  a  nonzero  initial  velocity  leaves  the 
string  permanently  deformed. 
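The permanent deformation can also be read off from formula (2.86) directly: by the limits (2.88), at any fixed position $x$ the solution tends to $\sqrt{\pi}/(2c)$ as $t \to \infty$. A small sketch using the standard library function `math.erf` (the function name `hammer_solution` and the sample values are our own illustrative choices) confirms this, together with the initial velocity.

```python
import math

def hammer_solution(t, x, c=1.0):
    """Formula (2.86): response to the initial velocity exp(-x^2), zero displacement."""
    return math.sqrt(math.pi) / (4.0 * c) * (math.erf(x + c * t) - math.erf(x - c * t))

c = 1.0
x = 0.3

# Long-time value at a fixed position: the string does not return to u = 0.
late_value = hammer_solution(50.0, x, c)
limit_value = math.sqrt(math.pi) / (2.0 * c)

# Centered difference in t at t = 0 recovers the initial velocity exp(-x^2).
h = 1e-4
initial_velocity = (hammer_solution(h, x, c) - hammer_solution(-h, x, c)) / (2.0 * h)

print(late_value, limit_value, initial_velocity, math.exp(-x**2))
```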

In general, the lines of slope $\pm c$, where the respective characteristic variables are
constant,

$$ \xi = x - ct = a, \qquad \eta = x + ct = b, \tag{2.89} $$

are known as the characteristics of the wave equation. Thus, the second-order wave equation
has two distinct characteristic lines passing through each point in the $(t,x)$-plane.

Remark :  The  characteristic  lines  are  the  one-dimensional  counterparts  of  the  light 
cone  in  Minkowski  space-time,  which  plays  a  starring  role  in  special  relativity,  [70,  75]. 
See  Section  12.5  for  further  details. 

In Figure 2.28, we plot the two characteristics going through a point $(0, y)$ on the $x$-axis.
The wedge-shaped region $\{\, y - ct \le x \le y + ct,\ t \ge 0 \,\}$ lying between them is known
as the domain of influence of the point $(0, y)$, since, in general, the value of the initial data
at  a  point  will  affect  the  subsequent  solution  values  only  in  its  domain  of  influence.  Indeed, 
the  effect  of  an  initial  displacement  at  the  point  y  propagates  along  the  two  characteristic 
lines,  while  the  effect  of  an  initial  velocity  there  will  be  felt  at  every  point  in  the  triangular 
wedge. 


External  Forcing  and  Resonance 


[Figure 2.28. Characteristic lines and domain of influence.]

When a homogeneous vibrating medium is subjected to external forcing, the wave equation
acquires an additional, inhomogeneous term:

$$ \frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2} + F(t,x), \tag{2.90} $$

in which $F(t,x)$ represents a force imposed at time $t$ and spatial position $x$. With a bit
more work, d’Alembert’s solution technique can be readily adapted to incorporate the
forcing term.

Let us, for simplicity, assume that the differential equation is supplemented by homogeneous
initial conditions,

$$ u(0,x) = 0, \qquad u_t(0,x) = 0, \tag{2.91} $$

meaning that there is no initial displacement or velocity. To solve the initial value problem
(2.90-91), we switch to the same characteristic coordinates (2.76), setting

$$ v(\xi,\eta) = u\!\left( \frac{\eta - \xi}{2c},\, \frac{\eta + \xi}{2} \right). $$


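Invoking the chain rule formulas (2.79), we find that the forced equation (2.90) becomes

$$ \frac{\partial^2 v}{\partial \xi\,\partial \eta} = -\,\frac{1}{4c^2}\, F\!\left( \frac{\eta - \xi}{2c},\, \frac{\eta + \xi}{2} \right). \tag{2.92} $$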

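Let us integrate both sides of the equation with respect to $\eta$, on the interval $\xi \le \zeta \le \eta$:

$$ \frac{\partial v}{\partial \xi}(\xi,\eta) - \frac{\partial v}{\partial \xi}(\xi,\xi) = -\,\frac{1}{4c^2} \int_{\xi}^{\eta} F\!\left( \frac{\zeta - \xi}{2c},\, \frac{\zeta + \xi}{2} \right) d\zeta. \tag{2.93} $$

But, recalling (2.78),

$$ \frac{\partial v}{\partial \xi}(\xi,\eta) = -\,\frac{1}{2c}\, \frac{\partial u}{\partial t}\!\left( \frac{\eta - \xi}{2c},\, \frac{\eta + \xi}{2} \right) + \frac{1}{2}\, \frac{\partial u}{\partial x}\!\left( \frac{\eta - \xi}{2c},\, \frac{\eta + \xi}{2} \right), $$

and so, in particular,

$$ \frac{\partial v}{\partial \xi}(\xi,\xi) = -\,\frac{1}{2c}\, \frac{\partial u}{\partial t}(0,\xi) + \frac{1}{2}\, \frac{\partial u}{\partial x}(0,\xi) = 0, $$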


which vanishes owing to our choice of homogeneous initial conditions (2.91). Indeed, the
initial velocity condition says that $u_t(0,x) = 0$, while differentiating the initial displacement
condition $u(0,x) = 0$ with respect to $x$ implies that $u_x(0,x) = 0$ for all $x$, including $x = \xi$.
As a result, (2.93) simplifies to

$$ \frac{\partial v}{\partial \xi}(\xi,\eta) = -\,\frac{1}{4c^2} \int_{\xi}^{\eta} F\!\left( \frac{\zeta - \xi}{2c},\, \frac{\zeta + \xi}{2} \right) d\zeta. $$

We now integrate the latter equation with respect to $\xi$ on the interval $\xi \le \chi \le \eta$, producing

$$ -\,v(\xi,\eta) = v(\eta,\eta) - v(\xi,\eta) = -\,\frac{1}{4c^2} \int_{\xi}^{\eta} \int_{\chi}^{\eta} F\!\left( \frac{\zeta - \chi}{2c},\, \frac{\zeta + \chi}{2} \right) d\zeta\, d\chi, $$

since $v(\eta,\eta) = u(0,\eta) = 0$, thanks again to the initial conditions. In this manner, we
have produced an explicit formula for the solution to the characteristic variable version of
the forced wave equation subject to the homogeneous initial conditions. Reverting to the
original physical coordinates, the left-hand side of this equation becomes $-u(t,x)$. As for
the double integral on the right-hand side, it takes place over the triangular region

$$ T(\xi,\eta) = \{\, (\chi,\zeta) \mid \xi \le \chi \le \zeta \le \eta \,\}. \tag{2.94} $$

Let us introduce “physical” integration variables by setting

$$ \chi = y - cs, \qquad \zeta = y + cs. $$

The defining inequalities of the triangle (2.94) become

$$ x - ct \le y - cs \le y + cs \le x + ct, $$

and so, in the physical coordinates, the triangular integration domain assumes the form

$$ D(t,x) = \{\, (s,y) \mid x - c\,(t - s) \le y \le x + c\,(t - s),\ 0 \le s \le t \,\}, \tag{2.95} $$

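which is graphed in Figure 2.29. The change of variables formula for double integrals
requires that we compute the Jacobian determinant

$$ \det \begin{pmatrix} \partial \chi / \partial y & \partial \chi / \partial s \\ \partial \zeta / \partial y & \partial \zeta / \partial s \end{pmatrix} = \det \begin{pmatrix} 1 & -c \\ 1 & c \end{pmatrix} = 2c, $$

and so $d\chi\, d\zeta = 2c\, ds\, dy$. Therefore

$$ u(t,x) = \frac{1}{2c} \iint_{D(t,x)} F(s,y)\, ds\, dy = \frac{1}{2c} \int_0^t \int_{x - c(t - s)}^{x + c(t - s)} F(s,y)\, dy\, ds, \tag{2.96} $$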

which  gives  the  solution  formula  for  the  forced  wave  equation  when  subject  to  homogeneous 
initial  conditions. 

To  solve  the  general  initial  value  problem,  we  appeal  to  linear  superposition,  writing  its 
solution  as  a  sum  of  the  solution  (2.96)  to  the  forced  wave  equation  subject  to  homogeneous 
initial  conditions  plus  the  d’Alembert  solution  (2.82)  to  the  unforced  equation  subject  to 
inhomogeneous initial conditions.

Theorem  2.18.  The  solution  to  the  general  initial  value  problem 


$$ u_{tt} = c^2 u_{xx} + F(t,x), \qquad u(0,x) = f(x), \qquad u_t(0,x) = g(x), \qquad -\infty < x < \infty, \quad t \ge 0, $$

for the wave equation subject to an external forcing is given by

$$ u(t,x) = \frac{f(x - ct) + f(x + ct)}{2} + \frac{1}{2c} \int_{x - ct}^{x + ct} g(y)\, dy + \frac{1}{2c} \int_0^t \int_{x - c(t - s)}^{x + c(t - s)} F(s,y)\, dy\, ds. \tag{2.97} $$

Observe that the solution is a linear superposition of the respective effects of the initial
displacement, the initial velocity, and the external forcing. The triangular integration
region (2.95), lying between the $x$-axis and the characteristic lines going backwards from
$(t,x)$, is known as the domain of dependence of the point $(t,x)$. This is because, for any
$t > 0$, the solution value $u(t,x)$ depends only on the values of the initial data and the
forcing function at points lying within the domain of dependence $D(t,x)$. Indeed, the first
term in the solution formula (2.97) requires only the initial displacement at the corners
$(0, x + ct)$, $(0, x - ct)$; the second term requires only the initial velocity at points on the
$x$-axis lying on the vertical side of $D(t,x)$; while the final term requires the value of the
external force on the entire triangular region.
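The two geometric notions are dual to one another: a point $(t,x)$ lies in the domain of influence of $(0,y)$ exactly when $(0,y)$ lies in the domain of dependence $D(t,x)$. The following sketch encodes both regions as simple membership tests; the function names and the sample values are hypothetical, chosen only for illustration.

```python
def in_domain_of_influence(t, x, y, c):
    """True when (t, x) lies in the wedge y - c t <= x <= y + c t, t >= 0."""
    return t >= 0 and y - c * t <= x <= y + c * t

def in_domain_of_dependence(s, y, t, x, c):
    """True when (s, y) lies in the triangle D(t, x) of (2.95)."""
    return 0 <= s <= t and x - c * (t - s) <= y <= x + c * (t - s)

# Illustrative check of the duality for unit wave speed at the point (t, x) = (2, 0).
c, t, x = 1.0, 2.0, 0.0
for y in (-3.0, -2.0, 0.5, 2.0, 2.5):
    influenced = in_domain_of_influence(t, x, y, c)
    depends = in_domain_of_dependence(0.0, y, t, x, c)
    print(y, influenced, depends)
```

For these values the two tests agree for every $y$, returning true precisely when $-2 \le y \le 2$.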

Example  2.19.  Let  us  solve  the  initial  value  problem 


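$$ u_{tt} = u_{xx} + \sin \omega t\, \sin x, \qquad u(0,x) = 0, \qquad u_t(0,x) = 0, $$

for the wave equation with unit wave speed subject to a sinusoidal forcing function whose
amplitude varies periodically in time with frequency $\omega > 0$. According to formula (2.96),
the solution is

$$ \begin{aligned} u(t,x) &= \frac{1}{2} \int_0^t \int_{x - t + s}^{x + t - s} \sin \omega s\, \sin y\, dy\, ds \\ &= \frac{1}{2} \int_0^t \sin \omega s\, \bigl[ \cos(x - t + s) - \cos(x + t - s) \bigr]\, ds \\ &= \begin{cases} \dfrac{\sin \omega t - \omega \sin t}{1 - \omega^2}\, \sin x, & 0 < \omega \ne 1, \\[2ex] \dfrac{\sin t - t \cos t}{2}\, \sin x, & \omega = 1. \end{cases} \end{aligned} $$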


Notice that, when $\omega \ne 1$, the solution is bounded, being a combination of two vibrational
modes: an externally induced mode at frequency $\omega$ along with an internal mode, at frequency 1.
If $\omega = p/q \ne 1$ is a rational number, then the solution varies periodically in
time. On the other hand, if $\omega$ is irrational, then the solution is only quasiperiodic, and never
exactly repeats itself. Finally, if $\omega = 1$, the solution grows without limit as $t$ increases,
indicating that this is a resonant frequency. We will investigate external forcing and the
mechanisms leading to resonance in dynamical partial differential equations in more detail
in Chapters 4 and 6.

[Figure 2.30. Periodic and quasiperiodic functions; one panel is labeled cos t + cos √5 t.]
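The closed-form solution of Example 2.19 can be compared against a direct numerical evaluation of the double integral (2.96). The sketch below is an illustration under our own assumptions: it uses a nested midpoint rule (a simple quadrature of our choosing), and the names `forced_solution`, `closed_form`, and `forcing` are hypothetical.

```python
import math

def forced_solution(F, c, t, x, n=200):
    """Midpoint-rule approximation of the double integral (2.96) over D(t, x)."""
    ds = t / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * ds
        half_width = c * (t - s)          # the slice of D(t, x) at time s
        dy = 2.0 * half_width / n
        inner = 0.0
        for j in range(n):
            y = x - half_width + (j + 0.5) * dy
            inner += F(s, y)
        total += inner * dy * ds
    return total / (2.0 * c)

def closed_form(omega, t, x):
    """Solution of Example 2.19, including the resonant case omega = 1."""
    if abs(omega - 1.0) < 1e-12:
        amplitude = 0.5 * (math.sin(t) - t * math.cos(t))
    else:
        amplitude = (math.sin(omega * t) - omega * math.sin(t)) / (1.0 - omega**2)
    return amplitude * math.sin(x)

def forcing(omega):
    """Forcing function sin(omega t) sin(x) of Example 2.19."""
    return lambda s, y: math.sin(omega * s) * math.sin(y)

t, x = 2.0, 0.7
for omega in (1.0, 2.5):
    print(omega, forced_solution(forcing(omega), 1.0, t, x), closed_form(omega, t, x))
```

Evaluating `closed_form` for $\omega$ slightly different from 1 also shows that the non-resonant formula approaches the resonant one as $\omega \to 1$, as one would expect from l'Hôpital's rule.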

Example  2.20.  To  appreciate  the  difference  between  periodic  and  quasiperiodic 
vibrations,  consider  the  elementary  trigonometric  function 

$$ u(t) = \cos t + \cos \omega t, $$

which is a linear combination of two simple periodic vibrations, of frequencies 1 and $\omega$. If
$\omega = p/q$ is a rational number, then $u(t)$ is a periodic function of period $2\pi q$, so $u(t + 2\pi q) = u(t)$.
However, if $\omega$ is an irrational number, then $u(t)$ is not periodic, and never repeats.
You are encouraged to inspect the graphs in Figure 2.30. The first is periodic (can you
spot where it begins to repeat?), whereas the second is only quasiperiodic. The only
quasiperiodic functions we will encounter in this text are linear combinations of periodic
trigonometric functions whose frequencies are not all rational multiples of each other. To
the uninitiated, such quasiperiodic motions may appear to be random, even though they are
built from a few simple periodic constituents. While ostensibly complicated, quasiperiodic
motion is not true chaos, which is an inherently nonlinear phenomenon, [77].
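The distinction can be tested numerically by measuring how far $u(t + T)$ deviates from $u(t)$ for a candidate period $T$. In the sketch below, the helper `max_shift_defect`, the sampling window, and the frequencies $3/2$ and $\sqrt{5}$ are illustrative assumptions; note also that a floating-point number can only approximate an irrational frequency.

```python
import math

def u(t, omega):
    """The function cos t + cos(omega t) of Example 2.20."""
    return math.cos(t) + math.cos(omega * t)

def max_shift_defect(omega, period, samples=2000, t_max=50.0):
    """Largest |u(t + period) - u(t)| over a uniform sample of [0, t_max]."""
    return max(
        abs(u(k * t_max / samples + period, omega) - u(k * t_max / samples, omega))
        for k in range(samples + 1)
    )

q = 2
candidate_period = 2.0 * math.pi * q

# omega = p/q = 3/2: the shift by 2*pi*q reproduces the function.
rational_defect = max_shift_defect(3.0 / 2.0, candidate_period)

# omega = sqrt(5): the same shift does not reproduce the function.
irrational_defect = max_shift_defect(math.sqrt(5.0), candidate_period)

print(rational_defect, irrational_defect)
```

The first defect is at the level of rounding error, whereas the second is of order one.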


Exercises 


2.4.1. Solve the initial value problem $u_{tt} = c^2 u_{xx}$, $u(0,x) = e^{-x^2}$, $u_t(0,x) = \sin x$.

2.4.2. (a) Solve the wave equation $u_{tt} = u_{xx}$ when the initial displacement is the box function

$$ u(0,x) = \begin{cases} 1, & 1 < x < 2, \\ 0, & \text{otherwise}, \end{cases} $$

while the initial velocity is 0. (b) Sketch the resulting solution at several representative times.


2.4.3. Answer Exercise 2.4.2 when the initial velocity is the box function, while the initial
displacement is zero.

2.4.4. Write the following solutions to the wave equation $u_{tt} = u_{xx}$ in d’Alembert form (2.82).
Hint: What is the appropriate initial data?

(a) $\cos x \cos t$, (b) $\cos 2x \sin 2t$, (c) $e^{x+t}$, (d) $t^2 + x^2$, (e) $t^3 + 3tx^2$.

♥ 2.4.5. (a) Solve the dam break problem, that is, the wave equation when the initial displacement
is a step function

$$ \sigma(x) = \begin{cases} 1, & x > 0, \\ 0, & x < 0, \end{cases} $$

and there is no initial velocity. (b) Analyze the case in which there is no initial displacement,
while the initial velocity is a step function. (c) Are your solutions classical solutions?
Explain your answer. (d) Prove that the step function is the limit, as $n \to \infty$, of the
functions $f_n(x) = \frac{1}{\pi} \tan^{-1} nx + \frac{1}{2}$. (e) Show that,
in both cases, the step function solution can be realized as the limit, as $n \to \infty$, of solutions
to the initial value problems with the functions $f_n(x)$ as initial displacement or velocity.

♦ 2.4.6. Suppose $u(t,x)$ solves the initial value problem $u(0,x) = f(x)$, $u_t(0,x) = g(x)$, for the
wave equation (2.66). Prove that the solution to the initial value problem $u(t_0,x) = f(x)$,
$u_t(t_0,x) = g(x)$, is $u(t - t_0, x)$.

2.4.7. Find all resonant frequencies for the wave equation with wave speed $c$ when subject to
the external forcing function $F(t,x) = \sin \omega t\, \sin kx$ for fixed $\omega, k > 0$.

2.4.8. Consider the initial value problem $u_{tt} = 4u_{xx} + F(t,x)$, $u(0,x) = f(x)$, $u_t(0,x) = g(x)$.
Determine (a) the domain of influence of the point $(0, 2)$; (b) the domain of dependence of
the point $(3, -1)$; (c) the domain of influence of the point $(3, -1)$.

2.4.9. (a) A solution to the wave equation $u_{tt} = 2u_{xx}$ is generated by a displacement concentrated
at position $x_0 = 1$ and time $t_0 = 0$, but no initial velocity. At what time will an
observer at position $x_1 = 5$ feel the effect of this displacement? Will the observer continue
to feel an effect in the future? (b) Answer part (a) when there is an initial velocity concentrated
at position $x_0 = 1$ and time $t_0 = 0$, but no initial displacement.

2.4.10. Suppose $u(t,x)$ solves the initial value problem $u_{tt} = 4u_{xx} + \sin \omega t\, \cos x$, $u(0,x) = 0$,
$u_t(0,x) = 0$. Is $h(t) = u(t, 0)$ a periodic function?

♣ 2.4.11. (a) Write down an explicit formula for the solution to the initial value problem

$$ \frac{\partial^2 u}{\partial t^2} - 4\,\frac{\partial^2 u}{\partial x^2} = 0, \qquad u(0,x) = \sin x, \qquad \frac{\partial u}{\partial t}(0,x) = \cos x, \qquad -\infty < x < \infty, \quad t > 0. $$

(b) True or false: The solution is a periodic function of $t$.

(c) Now solve the forced initial value problem

$$ \frac{\partial^2 u}{\partial t^2} - 4\,\frac{\partial^2 u}{\partial x^2} = \cos 2t, \qquad u(0,x) = \sin x, \qquad \frac{\partial u}{\partial t}(0,x) = \cos x, \qquad -\infty < x < \infty, \quad t > 0. $$

(d) True or false: The forced equation exhibits resonance. Explain.

(e) Does the answer to part (d) change if the forcing function is $\sin 2t$?

2.4.12. Given a classical solution $u(t,x)$ of the wave equation, let $E = \frac{1}{2}\,(u_t^2 + c^2 u_x^2)$ be the
associated energy density and $P = u_t u_x$ the momentum density.

(a) Show that both $E$ and $P$ are conserved densities for the wave equation.

(b) Show that $E(t,x)$ and $P(t,x)$ both satisfy the wave equation.


♠ 2.4.13. Let $u(t,x)$ be a classical solution to the wave equation $u_{tt} = c^2 u_{xx}$. The total energy

$$ E(t) = \int_{-\infty}^{\infty} \frac{1}{2} \left[ \left( \frac{\partial u}{\partial t} \right)^{2} + c^2 \left( \frac{\partial u}{\partial x} \right)^{2} \right] dx \tag{2.98} $$

represents the sum of kinetic and potential energies of the displacement $u(t,x)$ at time $t$.
Suppose that $\nabla u \to 0$ sufficiently rapidly as $x \to \pm\infty$; more precisely, one can find $\alpha > \frac{1}{2}$
and $C(t) > 0$ such that $|u_t(t,x)|,\ |u_x(t,x)| \le C(t)/|x|^{\alpha}$ for each fixed $t$ and all sufficiently
large $|x| \gg 0$. For such solutions, establish the Law of Conservation of Energy by showing
that $E(t)$ is finite and constant. Hint: You do not need the formula for the solution.


♦ 2.4.14. (a) Use Exercise 2.4.13 to prove that the only classical solution to the initial-boundary
value problem $u_{tt} = c^2 u_{xx}$, $u(0,x) = 0$, $u_t(0,x) = 0$, satisfying the indicated decay assumptions
is the trivial solution $u(t,x) = 0$. (b) Establish the following Uniqueness Theorem for
the wave equation: there is at most one such solution to the initial-boundary value problem
$u_{tt} = c^2 u_{xx}$, $u(0,x) = f(x)$, $u_t(0,x) = g(x)$.

2.4.15. The telegrapher’s equation $u_{tt} + a\,u_t = c^2 u_{xx}$, with $a > 0$, models the vibration of
a string under frictional damping. (a) Show that, under the decay assumptions of Exercise
2.4.13, the wave energy (2.98) of a classical solution is a nonincreasing function of $t$.
(b) Prove uniqueness of such solutions to the initial value problem for the telegrapher’s
equation.


2.4.16.  What  happens  to  the  proof  of  Theorem  2.14  if  c  =  0? 

2.4.17. (a) Explain why the d’Alembert factorization method doesn’t work when the wave speed
$c(x)$ depends on the spatial variable $x$.

(b)  Does  it  work  when  c(t)  depends  only  on  the  time  t? 


2.4.18. The Poisson-Darboux equation is

$$ \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} - \frac{2}{x}\, \frac{\partial u}{\partial x} = 0. $$

Solve the initial value problem $u(0,x) = 0$, $u_t(0,x) = g(x)$, where $g(x) = g(-x)$ is an even
function. Hint: Set $w = xu$.

♥ 2.4.19. (a) Solve the initial value problem $u_{tt} - 2u_{tx} - 3u_{xx} = 0$, $u(0,x) = x^2$, $u_t(0,x) = e^x$.
Hint: Factor the associated linear differential operator. (b) Determine the domain of influence
of a point $(0, x)$. (c) Determine the domain of dependence of a point $(t, x)$ with $t > 0$.


♦ 2.4.20. (a) Use polar coordinates to prove that, for any $a > 0$,

$$ \iint_{\mathbb{R}^2} e^{-a\,(x^2 + y^2)}\, dx\, dy = \frac{\pi}{a}. \tag{2.99} $$

(b) Explain why

$$ \int_{-\infty}^{\infty} e^{-a x^2}\, dx = \sqrt{\frac{\pi}{a}}. \tag{2.100} $$

♦ 2.4.21. Use Exercise 2.4.20 to prove the error function formulae (2.88).


Chapter  3 
Fourier  Series 


Just before 1800, the French mathematician/physicist/engineer Jean Baptiste Joseph
Fourier  made  an  astonishing  discovery,  [42].  Through  his  deep  analytical  investigations 
into  the  partial  differential  equations  modeling  heat  propagation  in  bodies,  Fourier  was 
led  to  claim  that  “every”  function  could  be  represented  as  an  infinite  series  of  elementary 
trigonometric  functions:  sines  and  cosines.  For  example,  consider  the  sound  produced  by 
a  musical  instrument,  e.g.,  piano,  violin,  trumpet,  or  drum.  Decomposing  the  signal  into 
its  trigonometric  constituents  reveals  the  fundamental  frequencies  (tones,  overtones,  etc.) 
that  combine  to  produce  the  instrument’s  distinctive  timbre.  This  Fourier  decomposition 
lies  at  the  heart  of  modern  electronic  music;  a  synthesizer  combines  pure  sine  and  cosine 
tones  to  reproduce  the  diverse  sounds  of  instruments,  both  natural  and  artificial,  according 
to  Fourier’s  general  prescription. 

Fourier’s claim was so remarkable and counterintuitive that most of the leading mathematicians
of the time did not believe him. Nevertheless, it was not long before scientists
came  to  appreciate  the  power  and  far-ranging  applicability  of  Fourier’s  method,  thereby 
opening  up  vast  new  realms  of  mathematics,  physics,  engineering,  and  beyond.  Indeed, 
Fourier’s  discovery  easily  ranks  in  the  “top  ten”  mathematical  advances  of  all  time,  a  list 
that  would  also  include  Newton’s  invention  of  the  calculus,  and  Gauss  and  Riemann’s 
differential  geometry,  which,  70  years  later,  became  the  foundation  of  Einstein’s  general 
relativity.  Fourier  analysis  is  an  essential  component  of  much  of  modern  applied  (and  pure) 
mathematics.  It  forms  an  exceptionally  powerful  analytic  tool  for  solving  a  broad  range  of 
linear  partial  differential  equations.  Applications  in  physics,  engineering,  biology,  finance, 
etc.,  are  almost  too  numerous  to  catalogue:  typing  the  word  “Fourier”  in  the  subject  index 
of a modern science library will dramatically demonstrate just how ubiquitous these methods
are. Fourier analysis lies at the heart of signal processing, including audio, speech,
images, videos, seismic data, radio transmissions, and so on. Many modern technological
advances, including television, music CDs and DVDs, cell phones, movies, computer
graphics,  image  processing,  and  fingerprint  analysis  and  storage,  are,  in  one  way  or  another, 
founded  on  the  many  ramifications  of  Fourier  theory.  In  your  career  as  a  mathematician, 
scientist,  or  engineer,  you  will  find  that  Fourier  theory,  like  calculus  and  linear  algebra,  is 
one  of  the  most  basic  weapons  in  your  mathematical  arsenal.  Mastery  of  the  subject  is 
essential. 

Furthermore, a surprisingly large fraction of modern mathematics rests on subsequent
attempts to place Fourier series on a firm mathematical foundation. Thus, many of modern
analysis’ most basic concepts, including the definition of a function, the ε–δ definition
of limit and continuity, convergence properties in function space, the modern theory of
integration and measure, generalized functions such as the delta function, and many others,
all owe a profound debt to the prolonged struggle to establish a rigorous framework for
Fourier  analysis.  Even  more  remarkably,  modern  set  theory,  and,  thus,  the  foundations 
of  modern  mathematics  and  logic,  can  be  traced  directly  back  to  the  nineteenth-century 
German  mathematician  Georg  Cantor’s  attempts  to  understand  the  sets  on  which  Fourier 
series  converge! 

We begin our development of Fourier methods by explaining why Fourier series naturally
appear when we try to solve the one-dimensional heat equation. The reader uninterested
in such motivations can safely omit this initial section, since the same material
reappears in Chapter 4, where we apply Fourier methods to solve several important linear
partial differential equations. Beginning in Section 3.2, we shall introduce the most basic
computational techniques for Fourier series. The final section is an abbreviated introduction
to the analytic background required to develop a rigorous foundation for Fourier series
methods. While this section is a bit more mathematically sophisticated than what has appeared
so far, the student is strongly encouraged to delve into it to gain additional insight
and  see  further  developments,  including  some  of  direct  importance  in  applications. 


3.1  Eigensolutions  of  Linear  Evolution  Equations 


Following  our  studies  of  first-order  partial  differential  equations  in  Chapter  2,  the  next 
important  example  to  merit  investigation  is  the  second-order  linear  equation 


$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \tag{3.1}$$


known as the heat equation, since it models (among other diffusion processes) heat flow
in  a  one-dimensional  medium,  e.g.,  a  metal  bar.  For  simplicity,  we  have  set  the  physical 
parameters  equal  to  1  in  order  to  focus  on  the  solution  techniques.  A  more  complete 
discussion,  including  a  brief  derivation  from  physical  principles,  will  appear  in  Chapter  4. 
Unlike  the  wave  equation  considered  in  Chapter  2,  there  is  no  comparably  elementary 
formula  for  the  general  solution  to  the  heat  equation.  Instead,  we  will  write  solutions 
as  infinite  series  in  certain  simple,  explicit  solutions.  This  solution  method,  pioneered  by 
Fourier,  will  lead  us  immediately  to  the  definition  of  a  Fourier  series.  The  remainder  of  this 
chapter  will  be  devoted  to  developing  the  basic  properties  and  calculus  of  Fourier  series. 
Once  we  have  mastered  these  essential  mathematical  techniques,  we  will  start  applying 
them  to  partial  differential  equations  in  Chapter  4. 


Let us begin by writing the heat equation (3.1) in a more abstract, but suggestive,
linear evolutionary form

$$\frac{\partial u}{\partial t} = L[u], \tag{3.2}$$

in which

$$L[u] = \frac{\partial^2 u}{\partial x^2} \tag{3.3}$$

is  a  linear  second-order  differential  operator.  Recall,  (1.11),  that  linearity  imposes  two 
requirements  on  the  operator  L : 


$$L[u + v] = L[u] + L[v], \qquad L[c\,u] = c\,L[u], \tag{3.4}$$

for any functions† $u, v$ and any constant $c$. Moreover, since $L$ involves differentiation only
with respect to $x$, it also satisfies

$$L[c(t)\,u] = c(t)\,L[u] \tag{3.5}$$

for any function $c(t)$ that does not depend on $x$.
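
As a quick numerical sanity check of (3.4), the following Python sketch replaces $L[u] = \partial^2 u/\partial x^2$ by a central-difference quotient and measures how far the two linearity identities fail at a few sample points. The helper names `second_derivative` and `max_linearity_defect`, the step size `h`, and the sample functions are illustrative assumptions, not part of the text, and the difference quotient only approximates $L$.

```python
import math


def second_derivative(f, x, h=1e-3):
    """Central-difference approximation of f''(x); a stand-in for L[f] = f_xx."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def max_linearity_defect(u, v, c, points):
    """Largest violation of L[u+v] = L[u]+L[v] and L[cu] = c L[u] over the sample points."""
    defect = 0.0
    for x in points:
        lhs_sum = second_derivative(lambda s: u(s) + v(s), x)
        rhs_sum = second_derivative(u, x) + second_derivative(v, x)
        lhs_scaled = second_derivative(lambda s: c * u(s), x)
        rhs_scaled = c * second_derivative(u, x)
        defect = max(defect, abs(lhs_sum - rhs_sum), abs(lhs_scaled - rhs_scaled))
    return defect


# Sample points in [-1, 1]; u = sin, v = exp, c = 3.5 are arbitrary choices.
points = [-1.0 + 0.25 * i for i in range(9)]
defect = max_linearity_defect(math.sin, math.exp, 3.5, points)
print(defect)
```

The printed defect should be at the level of floating-point rounding, because the difference quotient is itself a linear operation on the sampled values.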

Of course, there are many other possible linear differential operators, and so our abstract
linear evolution equation (3.2) can represent a wide range of linear partial differential
equations. For example, if

$$L[u] = -\,c(x)\,\frac{\partial u}{\partial x}, \tag{3.6}$$

where $c(x)$ is a function representing the wave speed in a nonuniform medium, then (3.2)
becomes the transport equation

$$\frac{\partial u}{\partial t} = -\,c(x)\,\frac{\partial u}{\partial x} \tag{3.7}$$


that  we  studied  in  Chapter  2.  If 


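$$L[u] = \frac{1}{\sigma(x)}\,\frac{\partial}{\partial x}\left(\kappa(x)\,\frac{\partial u}{\partial x}\right), \tag{3.8}$$

where $\sigma(x) > 0$ represents heat capacity and $\kappa(x) > 0$ thermal conductivity, then (3.2)
becomes the generalized heat equation

$$\frac{\partial u}{\partial t} = \frac{1}{\sigma(x)}\,\frac{\partial}{\partial x}\left(\kappa(x)\,\frac{\partial u}{\partial x}\right), \tag{3.9}$$

governing the diffusion of heat in a nonuniform bar. If

$$L[u] = \frac{\partial^2 u}{\partial x^2} - \gamma\,u, \tag{3.10}$$

where $\gamma > 0$ is a positive constant, then (3.2) becomes the damped heat equation

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} - \gamma\,u, \tag{3.11}$$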


which  models  the  temperature  of  a  bar  that  is  cooling  off  due  to  radiation  of  heat  energy. 
We  can  even  take  u  to  be  a  function  of  more  than  one  space  variable,  e.g.,  u(t,x,y)  or 
u(t,  x,y,  z),  in  which  case  (3.2)  includes  higher-dimensional  versions  of  the  heat  equation 
for  plates  and  solid  bodies,  which  we  will  study  in  due  course.  In  all  cases,  the  key 
requirements  on  the  operator  L  are  (a)  linearity,  and  (b)  only  differentiation  with  respect 
to  the  spatial  variables  is  allowed. 

Fourier’s  inspired  idea  for  solving  such  linear  evolution  equations  is  a  direct  adaptation 
of  the  eigensolution  method  for  first-order  linear  systems  of  ordinary  differential  equations, 
[20, 23, 89], which we now recall. The starting point is the elementary scalar ordinary
differential equation

$$\frac{du}{dt} = \lambda\,u. \tag{3.12}$$


† We assume throughout that the functions are sufficiently smooth so that the indicated
derivatives  are  well  defined. 

The  general  solution  is  an  exponential  function 

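$$u(t) = c\,e^{\lambda t}, \tag{3.13}$$

whose coefficient $c$ is an arbitrary constant. This elementary observation motivates the
solution method for a first-order homogeneous linear system of ordinary differential equations

$$\frac{d\mathbf{u}}{dt} = A\,\mathbf{u}, \tag{3.14}$$

in which $A$ is a constant $n \times n$ matrix. Working by analogy, we will seek solutions of
exponential form

$$\mathbf{u}(t) = e^{\lambda t}\,\mathbf{v}, \tag{3.15}$$

where $\mathbf{v} \in \mathbb{R}^n$ is a constant vector. We substitute this ansatz‡ into the equation. First,

$$\frac{d\mathbf{u}}{dt} = \frac{d}{dt}\left(e^{\lambda t}\,\mathbf{v}\right) = \lambda\,e^{\lambda t}\,\mathbf{v}.$$

On the other hand, since $e^{\lambda t}$ is a scalar, it commutes with matrix multiplication, and so

$$A\,\mathbf{u} = A\,e^{\lambda t}\,\mathbf{v} = e^{\lambda t}\,A\,\mathbf{v}.$$

Therefore, $\mathbf{u}(t)$ will solve the system (3.14) if and only if $\mathbf{v}$ satisfies

$$A\,\mathbf{v} = \lambda\,\mathbf{v}. \tag{3.16}$$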

We recognize this as the eigenequation that determines the eigenvalues of the matrix $A$.
Namely, (3.16) has a nonzero solution $\mathbf{v} \neq \mathbf{0}$ if and only if $\lambda$ is an eigenvalue and $\mathbf{v}$ a
corresponding eigenvector. Each eigenvalue $\lambda$ and eigenvector $\mathbf{v}$ produces a nonzero, exponentially
varying eigensolution (3.15) to the linear system of ordinary differential equations.

Remark: Any nonzero scalar multiple $\tilde{\mathbf{v}} = c\,\mathbf{v}$, for $c \neq 0$, of an eigenvector $\mathbf{v}$ is automatically
another eigenvector for the same eigenvalue $\lambda$. However, the only effect is to
multiply the eigensolution by the scalar $c$. Thus, to obtain a complete system of independent
solutions, we need only the independent eigenvectors.

For simplicity (and also because all of the linear partial differential equations we
will treat will have the analogous property), suppose that the $n \times n$ matrix $A$ has a
complete system of real eigenvalues $\lambda_1, \ldots, \lambda_n$ and corresponding real, linearly independent
eigenvectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$, which therefore form an eigenvector basis of the underlying space
$\mathbb{R}^n$. (We allow the possibility of repeated eigenvalues, but require that all eigenvectors be
independent to avoid superfluous solutions.) For example, according to Theorem B.26 (see
also [89; Theorem 8.20]), all real, symmetric matrices, $A = A^T$, are complete. Complex
eigenvalues lead to complex exponential solutions, whose real and imaginary parts can be
used to construct the associated real solutions. Incomplete matrices, having an insufficient
number of eigenvectors, are trickier, and the solution to the corresponding linear system
requires use of the Jordan canonical form, [89; Section 8.6]. Fortunately, we do not have
to deal with the latter, technically annoying, cases here.


‡ The German word ansatz refers to the method of finding a solution to a complicated equation
by postulating that it is of a special form. Usually, an ansatz will depend on one or more free
parameters (in this case, the entries of the vector $\mathbf{v}$ along with the scalar $\lambda$) that, with some
luck, can be adjusted to fulfill the requirements imposed by the equation. Thus, a reasonable
English translation of “ansatz” is “inspired guess”.

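Using our completeness assumption, we can produce $n$ independent real exponential
eigensolutions

$$\mathbf{u}_1(t) = e^{\lambda_1 t}\,\mathbf{v}_1, \quad \ldots, \quad \mathbf{u}_n(t) = e^{\lambda_n t}\,\mathbf{v}_n$$

to the linear system (3.14). The Linear Superposition Principle of Theorem 1.4 tells us
that, for any choice of scalars $c_1, \ldots, c_n$, the linear combination

$$c_1\,\mathbf{u}_1(t) + \cdots + c_n\,\mathbf{u}_n(t) = c_1\,e^{\lambda_1 t}\,\mathbf{v}_1 + \cdots + c_n\,e^{\lambda_n t}\,\mathbf{v}_n \tag{3.17}$$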

is  also  a  solution.  The  basic  Existence  and  Uniqueness  Theorems  for  first-order  systems  of 
ordinary  differential  equations,  [18,23,52],  imply  that  (3.17)  forms  the  general  solution 
to  the  original  linear  system,  and  so  the  eigensolutions  form  a  basis  for  the  solution  space. 
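
To make the eigensolution recipe concrete, here is a minimal Python sketch for the special case of a real symmetric $2 \times 2$ matrix, where the eigenvalues and eigenvectors have closed forms. The function names `eigen_symmetric_2x2` and `solve_by_eigensolutions`, and the sample matrix and initial vector, are hypothetical choices made for illustration; the restriction to $2 \times 2$ symmetric matrices is an assumption that keeps the example self-contained.

```python
import math


def eigen_symmetric_2x2(a, b, d):
    """Return [(lambda_1, v_1), (lambda_2, v_2)] for the symmetric matrix [[a, b], [b, d]]."""
    if b == 0.0:
        # Already diagonal: the coordinate vectors are eigenvectors.
        return [(a, (1.0, 0.0)), (d, (0.0, 1.0))]
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    # For each root lam of the characteristic polynomial, (b, lam - a) solves (A - lam I) v = 0.
    return [(lam, (b, lam - a)) for lam in (mean - radius, mean + radius)]


def solve_by_eigensolutions(a, b, d, u0, t):
    """Evaluate c1 e^{lambda_1 t} v1 + c2 e^{lambda_2 t} v2 with value u0 at t = 0, as in (3.17)."""
    (lam1, v1), (lam2, v2) = eigen_symmetric_2x2(a, b, d)
    # Solve c1 v1 + c2 v2 = u0 by Cramer's rule; det != 0 because v1, v2 are independent.
    det = v1[0] * v2[1] - v2[0] * v1[1]
    c1 = (u0[0] * v2[1] - v2[0] * u0[1]) / det
    c2 = (v1[0] * u0[1] - v1[1] * u0[0]) / det
    e1, e2 = math.exp(lam1 * t), math.exp(lam2 * t)
    return (c1 * e1 * v1[0] + c2 * e2 * v2[0],
            c1 * e1 * v1[1] + c2 * e2 * v2[1])


# Example: A = [[-2, 1], [1, -2]] has eigenvalues -3 and -1.
u_at_1 = solve_by_eigensolutions(-2.0, 1.0, -2.0, (1.0, 0.0), 1.0)
print(u_at_1)
```

For larger or nonsymmetric matrices one would normally call a numerical linear algebra library instead of these closed-form expressions.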

Let us now adapt this seminal idea to construct exponentially varying solutions to the
heat equation (3.1) or, for that matter, any linear evolution equation in the form (3.2). To
this end, we introduce an analogous exponential ansatz:

$$u(t, x) = e^{\lambda t}\,v(x), \tag{3.18}$$

in which we replace the vector $\mathbf{v}$ in (3.15) by a function $v(x)$. We substitute the expression
(3.18) into the dynamical equations (3.2). First, the time derivative of such a function is

$$\frac{\partial u}{\partial t} = \frac{\partial}{\partial t}\left[e^{\lambda t}\,v(x)\right] = \lambda\,e^{\lambda t}\,v(x).$$

On the other hand, in view of (3.5),

$$L[u] = L\left[e^{\lambda t}\,v(x)\right] = e^{\lambda t}\,L[v].$$

Equating these two expressions and canceling the common exponential factor, we conclude
that $v(x)$ must satisfy the eigenequation

$$L[v] = \lambda\,v \tag{3.19}$$

for the linear differential operator $L$, in which $\lambda$ is the eigenvalue, while $v(x)$ is the corresponding
eigenfunction. Each eigenvalue and eigenfunction pair will produce an exponentially
varying eigensolution (3.18) to the partial differential equation (3.2). We will then
appeal to Linear Superposition to combine the resulting eigensolutions to form additional
solutions. The key complication is that partial differential equations admit an infinite
number of independent eigensolutions, and thus one cannot hope to write the general solution
as a finite linear combination thereof. Rather, one is led to try constructing solutions
as infinite series in the eigensolutions. However, justifying such series solution formulas
requires  additional  analytical  skills  and  sophistication.  Not  every  infinite  series  converges 
to  a  bona  fide  function.  Moreover,  a  convergent  series  of  differentiable  functions  need  not 
converge  to  a  differentiable  function,  and  hence  the  series  may  not  represent  a  (classical) 
solution  to  the  partial  differential  equation.  We  are  being  reminded,  yet  again,  that  partial 
differential  equations  are  much  wilder  creatures  than  their  relatively  tame  cousins,  ordinary 
differential  equations. 

Let us, for specificity, focus our attention on the heat equation, for which the linear
operator $L$ is given by (3.3). If $v(x)$ is a function of $x$ alone, then

$$L[v] = v''(x).$$

Thus, our eigenequation (3.19) becomes

$$v'' = \lambda\,v. \tag{3.20}$$

This is a linear second-order ordinary differential equation for $v(x)$, and so has two linearly
independent solutions. The explicit solution formulas depend on the sign of the eigenvalue
$\lambda$, and can be found in any basic text on ordinary differential equations, e.g., [20, 23]. The
following table summarizes the results for real eigenvalues $\lambda$; the case of complex $\lambda$ is left
as Exercise 3.1.3 for the reader. The resulting exponential eigensolutions are also referred
to  as  separable  solutions  to  indicate  that  they  are  the  product  of  a  function  of  t  alone  and  a 
function  of  x  alone.  The  general  method  of  separation  of  variables  will  be  one  of  our  main 
tools  for  solving  linear  partial  differential  equations,  to  be  developed  in  detail  starting  in 
Chapter  4. 


Real Eigensolutions of the Heat Equation

  λ                 Eigenfunctions v(x)      Eigensolutions u(t,x) = e^{λt} v(x)
  ----------------  -----------------------  -------------------------------------
  λ = −ω² < 0       cos ωx,  sin ωx          e^{−ω²t} cos ωx,  e^{−ω²t} sin ωx
  λ = 0             1,  x                    1,  x
  λ = ω² > 0        e^{−ωx},  e^{ωx}         e^{ω²t − ωx},  e^{ω²t + ωx}

Remark: Thus, in the absence of boundary conditions, each real number $\lambda$ qualifies as
an  eigenvalue  of  the  linear  differential  operator  (3.3),  possessing  two  linearly  independent 
eigenfunctions,  and  thus  two  linearly  independent  eigensolutions  to  the  heat  equation.  As 
with  eigenvectors,  any  (nonzero)  linear  combination  of  eigenfunctions  (eigensolutions)  with 
the  same  eigenvalue  is  also  an  eigenfunction  (eigensolution).  Thus,  the  preceding  table  lists 
only  independent  eigenfunctions  and  eigensolutions. 

As noted above, any finite linear combination of these basic eigensolutions is automatically
a solution. Thus, for example,

$$u(t, x) = c_1\,e^{-t} \cos x + c_2\,e^{-4t} \sin 2x + c_3\,x + c_4$$

is  a  solution  to  the  heat  equation  for  any  choice  of  constants  c1,  c2,  c3,  c4,  as  you  can  easily 
check.  But,  since  there  are  infinitely  many  independent  eigensolutions,  we  cannot  expect 
to  be  able  to  represent  every  solution  to  the  heat  equation  as  a  finite  linear  combination 
of  eigensolutions.  And  so,  we  must  learn  how  to  deal  with  infinite  series  of  eigensolutions. 
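
The claim that this four-term combination solves the heat equation can be checked numerically. The Python sketch below approximates $u_t$ and $u_{xx}$ by central differences and evaluates the residual $u_t - u_{xx}$ at a few points; the constants in `COEFFS`, the step size `h`, and the function names are arbitrary illustrative choices, and a finite-difference check is evidence rather than a proof.

```python
import math

# Arbitrary sample values for c1, c2, c3, c4.
COEFFS = (1.0, -2.0, 0.5, 3.0)


def u(t, x):
    """The combination c1 e^{-t} cos x + c2 e^{-4t} sin 2x + c3 x + c4."""
    c1, c2, c3, c4 = COEFFS
    return (c1 * math.exp(-t) * math.cos(x)
            + c2 * math.exp(-4.0 * t) * math.sin(2.0 * x)
            + c3 * x + c4)


def heat_residual(t, x, h=1e-3):
    """Central-difference approximation of u_t - u_xx at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / (h * h)
    return u_t - u_xx


residuals = [heat_residual(t, x) for t in (0.1, 0.5, 1.0) for x in (-2.0, 0.3, 1.7)]
print(max(abs(r) for r in residuals))
```

The residual is not exactly zero because of the $O(h^2)$ truncation error of the difference quotients, but it should shrink as `h` is reduced (until rounding error takes over).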

Remark: Eigensolutions in the first class, where $\lambda < 0$, are exponentially decaying,
which is in accord with our physical intuition as to how the temperature of a body should
behave. Those in the second class are constant in time, which is also physically reasonable. However,
those in the third class, corresponding to positive eigenvalues $\lambda > 0$, are exponentially
growing in time. In the absence of external heat sources, physical bodies should approach
some sort of thermal equilibrium, and certainly not exhibit an exponentially growing temperature!
However, notice that the latter eigensolutions (as well as the solution $x$) are not
bounded in space, and so include an infinite amount of heat energy being supplied to the
system from infinity. As we will soon come to appreciate, physically relevant boundary
conditions  —  posed  either  on  a  bounded  interval  or  by  specifying  the  asymptotics  of  the 
solutions  at  large  distances  —  will  separate  out  the  physically  reasonable  solutions  from 
the  mathematically  valid  but  physically  irrelevant  ones. 


The  Heated  Ring 


So  far,  we  have  not  paid  any  attention  to  boundary  conditions.  As  noted  above,  these  will 
eliminate  nonphysical  eigensolutions  and  thereby  reduce  the  collection  to  a  manageable, 
albeit  still  infinite,  number.  In  this  subsection,  we  will  discuss  a  particularly  important 
case,  which,  following  Fourier’s  line  of  reasoning,  leads  us  directly  into  the  heart  of  Fourier 
series. 

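Consider the heat equation on the interval $-\pi \le x \le \pi$, subject to the periodic
boundary conditions

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad u(t, -\pi) = u(t, \pi), \qquad \frac{\partial u}{\partial x}(t, -\pi) = \frac{\partial u}{\partial x}(t, \pi). \tag{3.21}$$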

The  physical  problem  being  modeled  is  the  thermodynamic  behavior  of  an  insulated  circular 
ring,  in  which  x  represents  the  angular  coordinate.  The  boundary  conditions  ensure  that 
the temperature remains continuously differentiable at the junction point where the angle
switches over from $-\pi$ to $\pi$. Given the ring’s initial temperature distribution

$$u(0, x) = f(x), \qquad -\pi \le x \le \pi, \tag{3.22}$$

our  task  is  to  determine  the  temperature  of  the  ring  u(t,  x)  at  each  subsequent  time  t  >  0. 

Let  us  find  out  which  of  the  preceding  eigensolutions  respect  the  boundary  conditions. 
Substituting  our  exponential  ansatz  (3.18)  into  the  differential  equation  and  boundary 
conditions  (3.21),  we  find  that  the  eigenfunction  v(x)  must  satisfy  the  periodic  boundary 
value problem

$$v'' = \lambda\,v, \qquad v(-\pi) = v(\pi), \qquad v'(-\pi) = v'(\pi). \tag{3.23}$$

Our task is to find those values of $\lambda$ for which (3.23) has a nonzero solution $v(x) \not\equiv 0$.
These are the eigenvalues and eigenfunctions.

As noted above, there are three cases, depending on the sign of $\lambda$. First, suppose
$\lambda = \omega^2 > 0$. Then the general solution to the ordinary differential equation is

$$v(x) = a\,e^{\omega x} + b\,e^{-\omega x},$$

where $a, b$ are arbitrary constants. Substituting into the boundary conditions, we find that
$a, b$ must satisfy the pair of linear equations

$$a\,e^{-\omega\pi} + b\,e^{\omega\pi} = a\,e^{\omega\pi} + b\,e^{-\omega\pi}, \qquad a\,\omega\,e^{-\omega\pi} - b\,\omega\,e^{\omega\pi} = a\,\omega\,e^{\omega\pi} - b\,\omega\,e^{-\omega\pi}.$$

Since $\omega \neq 0$, the first equation implies that $a = b$, while the second requires $a = -b$. So,
the only way to satisfy both boundary conditions is to take $a = b = 0$, and so $v(x) \equiv 0$ is
a trivial solution. We conclude that there are no positive eigenvalues.

Second, if $\lambda = 0$, then the ordinary differential equation reduces to $v'' = 0$, with
solution

$$v(x) = a + b\,x.$$

Substituting into the boundary conditions requires

$$a - b\,\pi = a + b\,\pi, \qquad b = b.$$

The first equation implies that $b = 0$, but this is the only condition. Therefore, any constant
function, $v(x) \equiv a$, solves the boundary value problem, and hence $\lambda = 0$ is an eigenvalue.
We take $v_0(x) \equiv 1$ as the unique independent eigenfunction, bearing in mind that any
constant multiple of an eigenfunction is automatically also an eigenfunction. We will call
1 a null eigenfunction, indicating that it is associated with the zero eigenvalue $\lambda = 0$. The
corresponding eigensolution (3.18) is $u(t, x) = e^{0\,t}\,v_0(x) = 1$, a constant solution to the
heat  equation. 

Finally, we must deal with the case $\lambda = -\omega^2 < 0$. Now, the general solution to the
differential equation in (3.23) is a trigonometric function:

$$v(x) = a \cos \omega x + b \sin \omega x. \tag{3.24}$$

Since

$$v'(x) = -\,a\,\omega \sin \omega x + b\,\omega \cos \omega x,$$

when we substitute into the boundary conditions, we obtain

$$a \cos \omega\pi - b \sin \omega\pi = a \cos \omega\pi + b \sin \omega\pi,$$
$$a \sin \omega\pi + b \cos \omega\pi = -\,a \sin \omega\pi + b \cos \omega\pi,$$

where we canceled out a common factor of $\omega$ in the second equation. These simplify to

$$2\,b \sin \omega\pi = 0, \qquad 2\,a \sin \omega\pi = 0.$$

If $\sin \omega\pi \neq 0$, then $a = b = 0$, and so we have only the trivial solution $v(x) \equiv 0$. Thus, to
obtain a nonzero eigenfunction, we must have

$$\sin \omega\pi = 0,$$

which requires that $\omega = 1, 2, 3, \ldots$ be a positive integer. For such $\omega_k = k$, every solution

$$v(x) = a \cos kx + b \sin kx, \qquad k = 1, 2, 3, \ldots,$$

satisfies both boundary conditions, and hence (unless identically zero) qualifies as an eigenfunction
of the boundary value problem. Thus, the eigenvalue $\lambda_k = -k^2$ admits a two-dimensional
space of eigenfunctions, with basis $v_k(x) = \cos kx$ and $\tilde{v}_k(x) = \sin kx$.
Consequently, the basic trigonometric functions

$$1, \quad \cos x, \quad \sin x, \quad \cos 2x, \quad \sin 2x, \quad \cos 3x, \quad \ldots \tag{3.25}$$

form a system of independent eigenfunctions for the periodic boundary value problem
(3.23). The corresponding exponentially varying eigensolutions are

$$u_k(t, x) = e^{-k^2 t} \cos kx, \qquad \tilde{u}_k(t, x) = e^{-k^2 t} \sin kx, \qquad k = 0, 1, 2, 3, \ldots, \tag{3.26}$$

each of which, by design, is a solution to the heat equation (3.21) and satisfies the periodic
boundary conditions. Note that we subsumed the case $\lambda_0 = 0$ into (3.26), keeping in mind
that, when $k = 0$, the sine function is trivial, and hence $\tilde{u}_0(t, x) \equiv 0$ is not needed. So the null
eigenvalue $\lambda_0 = 0$ provides (up to a constant multiple) only one eigensolution, whereas the
strictly negative eigenvalues $\lambda_k = -k^2 < 0$ each provide two independent eigensolutions.
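
The selection of integer frequencies by the periodic boundary conditions can be seen numerically. The following Python sketch, with the hypothetical helper `periodic_mismatch`, measures the largest jump in value or slope of $\cos \omega x$ and $\sin \omega x$ between $x = -\pi$ and $x = \pi$; the sample frequencies are arbitrary choices for illustration.

```python
import math


def periodic_mismatch(omega):
    """Largest jump in value or slope of cos(omega x), sin(omega x) between x = -pi and x = pi."""
    left, right = -omega * math.pi, omega * math.pi
    jumps = [
        math.cos(left) - math.cos(right),                    # value of cos(omega x)
        math.sin(left) - math.sin(right),                    # value of sin(omega x)
        -omega * math.sin(left) + omega * math.sin(right),   # slope of cos(omega x)
        omega * math.cos(left) - omega * math.cos(right),    # slope of sin(omega x)
    ]
    return max(abs(j) for j in jumps)


for omega in (1.0, 2.0, 3.0, 0.5, 1.3):
    print(omega, periodic_mismatch(omega))
```

For integer `omega` the mismatch is zero up to rounding, in agreement with $\sin \omega\pi = 0$; for the non-integer samples it is of order one.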

Remark: For completeness, one should also consider the possibility of complex eigenvalues.
If $\lambda = \omega^2 \neq 0$, where $\omega$ is now allowed to be complex, then all solutions to the
differential equation (3.23) are of the form

$$v(x) = a\,e^{\omega x} + b\,e^{-\omega x}.$$

The periodic boundary conditions require

$$a\,e^{-\omega\pi} + b\,e^{\omega\pi} = a\,e^{\omega\pi} + b\,e^{-\omega\pi}, \qquad a\,\omega\,e^{-\omega\pi} - b\,\omega\,e^{\omega\pi} = a\,\omega\,e^{\omega\pi} - b\,\omega\,e^{-\omega\pi}.$$

If $e^{\omega\pi} \neq e^{-\omega\pi}$, or, equivalently, $e^{2\omega\pi} \neq 1$, then the first condition implies $a = b$, but then
the second implies $a = b = 0$, and so $\lambda = \omega^2$ is not an eigenvalue. Thus, the eigenvalues
only occur when $e^{2\omega\pi} = 1$. This implies $\omega = k\,\mathrm{i}$, where $k$ is an integer, and so $\lambda = -k^2$,
leading  back  to  the  known  trigonometric  solutions.  Later,  in  Section  9.5,  we  will  learn  that 
the  “self-adjoint”  structure  of  the  underlying  boundary  value  problem  implies,  a  priori,  that 
all  its  eigenvalues  are  necessarily  real  and  nonpositive.  So  a  good  part  of  the  preceding 
analysis  was,  in  fact,  superfluous. 

We  conclude  that  there  is  an  infinite  number  of  independent  eigensolutions  (3.26)  to 
the  periodic  heat  equation  (3.21).  Linear  Superposition,  as  described  in  Theorem  1.4,  tells 
us  that  any  finite  linear  combination  of  the  eigensolutions  is  automatically  a  solution  to 
the periodic heat equation. However, only solutions whose initial data $u(0, x) = f(x)$ happens
to be a finite linear combination of the trigonometric eigenfunctions (a trigonometric
polynomial) can be so represented. Fourier’s brilliant idea was to propose taking infinite
“linear combinations” of the eigensolutions in an attempt to solve the general initial value
problem. Thus, we try representing a general solution to the periodic heat equation as an
infinite series of the form†

$$u(t, x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \left[\, a_k\,e^{-k^2 t} \cos kx + b_k\,e^{-k^2 t} \sin kx \,\right]. \tag{3.27}$$

The coefficients $a_0, a_1, a_2, \ldots, b_1, b_2, \ldots$ are constants, to be fixed by the initial condition.

Indeed,  substituting  our  proposed  solution  formula  (3.27)  into  (3.22),  we  obtain 


$$f(x) = u(0, x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \left[\, a_k \cos kx + b_k \sin kx \,\right]. \tag{3.28}$$

Thus, we must represent the initial temperature distribution $f(x)$ as an infinite Fourier
series in the elementary trigonometric eigenfunctions. Once we have prescribed the Fourier
coefficients $a_0, a_1, a_2, \ldots, b_1, b_2, \ldots$, we expect that the corresponding eigensolution series
(3.27) will provide an explicit formula for the solution to the periodic initial-boundary
value problem for the heat equation.
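
When the initial data is a trigonometric polynomial, the series (3.27) is a finite sum and can be evaluated directly. The Python sketch below does this for given coefficient lists; the function name `ring_temperature`, the list layout (with `b[0]` unused), and the sample coefficients are assumptions made for illustration, and no claim is made here about convergence of the genuinely infinite series.

```python
import math


def ring_temperature(t, x, a, b):
    """Partial sum of (3.27): a[0]/2 + sum_k e^{-k^2 t} (a[k] cos kx + b[k] sin kx); b[0] is unused."""
    total = 0.5 * a[0]
    for k in range(1, len(a)):
        decay = math.exp(-k * k * t)
        total += decay * (a[k] * math.cos(k * x) + b[k] * math.sin(k * x))
    return total


# Sample initial data f(x) = 1 + cos x + 3 sin 2x, i.e. a0 = 2, a1 = 1, b2 = 3.
a = [2.0, 1.0, 0.0]
b = [0.0, 0.0, 3.0]
for t in (0.0, 0.5, 5.0):
    print(t, ring_temperature(t, 0.3, a, b))
```

As `t` grows, every term with $k \geq 1$ decays and the temperature tends to the constant $a_0/2$, which here equals 1.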

However,  infinite  series  are  much  more  delicate  than  finite  sums,  and  so  this  formal 
construction  requires  some  serious  mathematical  analysis  to  place  it  on  a  rigorous  founda¬ 
tion.  The  key  questions  are: 

• When does an infinite trigonometric Fourier series converge?

• What kinds of functions $f(x)$ can be represented by a convergent Fourier series?

• Given such a function, how do we determine its Fourier coefficients $a_k, b_k$?

• Are we allowed to differentiate a Fourier series?

• Does the result actually form a solution to the initial-boundary value problem for the
heat equation?


† For technical reasons, one takes the basic null eigenfunction to be $\frac{1}{2}$ instead of 1. The reason
for this choice will be revealed in the following section.

These  are  the  basic  issues  in  Fourier  analysis,  which  must  be  properly  addressed  before  we 
can  make  any  serious  progress  towards  actually  solving  the  heat  equation.  Thus,  we  will 
leave  partial  differential  equations  aside  for  the  time  being,  and  start  a  detailed  investigation 
into  the  mathematics  of  Fourier  series. 


Exercises 


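3.1.1. For each of the following differential operators, (i) prove linearity; (ii) prove (3.5);
(iii) write down the corresponding linear evolution equation (3.2):

(a) $\dfrac{\partial}{\partial x}$, (b) $\dfrac{\partial}{\partial x} + 1$, (c) $\dfrac{\partial^2}{\partial x^2} + 3\,\dfrac{\partial}{\partial x}$, (d) $\dfrac{\partial}{\partial x}\,e^x\,\dfrac{\partial}{\partial x}$, (e) $\dfrac{\partial^2}{\partial x^2} + \dfrac{\partial^2}{\partial y^2}$.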


3.1.2. Find all separable eigensolutions to the heat equation $u_t = u_{xx}$ on the interval $0 \le x \le \pi$
subject to (a) homogeneous Dirichlet boundary conditions $u(t, 0) = 0$, $u(t, \pi) = 0$;

(b) mixed boundary conditions $u(t, 0) = 0$, $u_x(t, \pi) = 0$;

(c) Neumann boundary conditions $u_x(t, 0) = 0$, $u_x(t, \pi) = 0$.

3.1.3.  Complete  the  table  of  eigensolutions  to  the  heat  equation,  in  the  absence  of  boundary 
conditions, by allowing the eigenvalue $\lambda$ to be complex.


3.1.4. Find all separable eigensolutions to the following partial differential equations:

(a) $u_t = u_x$, (b) $u_t = u_x - u$, (c) $u_t = x\,u_x$.

3.1.5. (a) Find the real eigensolutions to the damped heat equation $u_t = u_{xx} - u$. (b) Which
solutions satisfy the periodic boundary conditions $u(t, -\pi) = u(t, \pi)$, $u_x(t, -\pi) = u_x(t, \pi)$?

3.1.6. Answer Exercise 3.1.5 for the diffusive transport equation $u_t + c\,u_x = u_{xx}$ modeling the
combined diffusion and transport of a solute in a uniform flow with constant wave speed $c$.

3.1.7. (a) Find the real eigensolutions to the diffusion equation $u_t = (x^2\,u_x)_x$ modeling diffusion
in an inhomogeneous medium on the half-line $x > 0$.

(b) Which solutions satisfy the Dirichlet boundary conditions $u(t, 1) = u(t, 2) = 0$?


3.2  Fourier  Series 


The  preceding  section  served  to  motivate  the  development  of  Fourier  series  as  a  tool  for 
solving  partial  differential  equations.  Our  immediate  goal  is  to  represent  a  given  function 
f(x)  as  a  convergent  series  in  the  elementary  trigonometric  functions: 


$$f(x) = \frac{a_0}{2} + \sum_{k=1}^{\infty} \left[\, a_k \cos kx + b_k \sin kx \,\right]. \tag{3.29}$$

The first order of business is to determine the formulae for the Fourier coefficients $a_k, b_k$;
only then will we deal with convergence issues.

The key that unlocks the Fourier treasure chest is orthogonality. Recall that two vectors
in Euclidean space are called orthogonal if they meet at a right angle. More explicitly,
$\mathbf{v}, \mathbf{w}$ are orthogonal if and only if their dot product is zero: $\mathbf{v} \cdot \mathbf{w} = 0$. Orthogonality,
and particularly orthogonal bases, has profound consequences that underpin many modern
computational algorithms. See Section B.4 for the basics, and [89] for full details on
finite-dimensional developments. In infinite-dimensional function space, were it not for orthogonality,
Fourier theory would be vastly more complicated, if not completely impractical
for applications.

The starting point is the introduction of a suitable inner product on function space, to
assume the role played by the dot product in the finite-dimensional context. For classical
Fourier series, we use the rescaled $L^2$ inner product

$$\langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\,g(x)\,dx \tag{3.30}$$

on the space of continuous functions defined on the interval $[-\pi, \pi]$.† It is not hard to
show that (3.30) satisfies the basic inner product axioms listed in Definition B.10. The
associated norm is

$$\| f \| = \sqrt{\langle f, f \rangle} = \sqrt{\frac{1}{\pi} \int_{-\pi}^{\pi} f(x)^2\,dx}\,. \tag{3.31}$$

Lemma 3.1. Under the rescaled $L^2$ inner product (3.30), the trigonometric functions
$1, \cos x, \sin x, \cos 2x, \sin 2x, \ldots$, satisfy the following orthogonality relations:

$$\begin{aligned}
\langle \cos kx, \cos lx \rangle &= \langle \sin kx, \sin lx \rangle = 0, && \text{for } k \neq l, \\
\langle \cos kx, \sin lx \rangle &= 0, && \text{for all } k, l, \\
\| 1 \| = \sqrt{2}, \qquad \| \cos kx \| &= \| \sin kx \| = 1, && \text{for } k \neq 0,
\end{aligned} \tag{3.32}$$

where $k$ and $l$ indicate nonnegative integers.


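Proof: The formulas follow immediately from the elementary integration identities

$$\int_{-\pi}^{\pi} \cos kx \cos lx\,dx = \begin{cases} 0, & k \neq l, \\ 2\pi, & k = l = 0, \\ \pi, & k = l \neq 0, \end{cases} \qquad \int_{-\pi}^{\pi} \sin kx \sin lx\,dx = \begin{cases} 0, & k \neq l, \\ \pi, & k = l \neq 0, \end{cases}$$

$$\int_{-\pi}^{\pi} \cos kx \sin lx\,dx = 0, \tag{3.33}$$

which are valid for all nonnegative integers $k, l \geq 0$. Q.E.D.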
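
The orthogonality relations (3.32) are easy to confirm numerically. The Python sketch below approximates the rescaled inner product (3.30) by the midpoint rule; the function names `inner`, `cos_k`, `sin_k` and the number of sample points `n` are illustrative assumptions. For trigonometric polynomials of low degree the equally spaced midpoint rule happens to be accurate to rounding error, which is why such a simple quadrature suffices here.

```python
import math


def inner(f, g, n=2000):
    """Midpoint-rule approximation of (1/pi) * integral of f(x) g(x) over [-pi, pi]."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h / math.pi


def cos_k(k):
    """Return the function x -> cos(k x)."""
    return lambda x: math.cos(k * x)


def sin_k(k):
    """Return the function x -> sin(k x)."""
    return lambda x: math.sin(k * x)


print(inner(cos_k(2), cos_k(3)))   # distinct cosines
print(inner(cos_k(2), sin_k(2)))   # cosine against sine
print(inner(sin_k(3), sin_k(3)))   # squared norm of sin 3x
print(inner(cos_k(0), cos_k(0)))   # squared norm of the constant function 1
```

The last value approximates $\| 1 \|^2 = 2$, the source of the factor $\frac{1}{2}$ attached to $a_0$.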


Lemma 3.1 implies that the elementary trigonometric functions form an orthogonal
system, meaning that any distinct pair are orthogonal under the chosen inner product. If
we were to replace the constant function 1 by $\frac{1}{\sqrt{2}}$, then the resulting functions would form
an orthonormal system, meaning that, in addition, they all have norm 1. However, the
extra $\sqrt{2}$ is utterly annoying, and best omitted.


† We have chosen to use the interval $[-\pi, \pi]$ for convenience. A common alternative is to
develop Fourier series on the interval $[0, 2\pi]$. In fact, since the basic trigonometric functions are
$2\pi$-periodic, any interval of length $2\pi$ will serve equally well. Adapting Fourier series to other
intervals will be discussed in Section 3.4.

Remark: As with all essential mathematical facts, the orthogonality of the trigonometric
functions is not an accident, but indicates that something deeper is going on. Indeed,
orthogonality is a consequence of the fact that the trigonometric functions are the eigenfunctions
for the “self-adjoint” boundary value problem (3.23), which is the function space
counterpart to the orthogonality of eigenvectors of symmetric matrices, cf. Theorem B.26.
The general framework will be developed in detail in Section 9.5, and then applied to the
more complicated systems of eigenfunctions we will encounter when dealing with higher-dimensional
partial differential equations.


If  we  ignore  convergence  issues,  then  the  trigonometric  orthogonality  relations  serve 
to  prescribe  the  Fourier  coefficients:  Taking  the  inner  product  of  both  sides  of  (3.29)  with 
$\cos lx$ for $l > 0$, and invoking linearity of the inner product, yields

$$\langle f, \cos lx \rangle = \frac{a_0}{2}\,\langle 1, \cos lx \rangle + \sum_{k=1}^{\infty} \left[\, a_k\,\langle \cos kx, \cos lx \rangle + b_k\,\langle \sin kx, \cos lx \rangle \,\right] = a_l\,\langle \cos lx, \cos lx \rangle = a_l,$$

since, by the orthogonality relations (3.32), all terms but the $l$th vanish. This serves to prescribe
the Fourier coefficient $a_l$. A similar manipulation with $\sin lx$ fixes $b_l = \langle f, \sin lx \rangle$,
while taking the inner product with the constant function 1 gives

$$\langle f, 1 \rangle = \frac{a_0}{2}\,\langle 1, 1 \rangle + \sum_{k=1}^{\infty} \left[\, a_k\,\langle \cos kx, 1 \rangle + b_k\,\langle \sin kx, 1 \rangle \,\right] = \frac{a_0}{2}\,\| 1 \|^2 = a_0,$$

which agrees with the preceding formula for $a_l$ when $l = 0$, and explains why we include
the extra factor $\frac{1}{2}$ in the constant term. Thus, if the Fourier series converges to the
function  f(x),  then  its  coefficients  are  prescribed  by  taking  inner  products  with  the  basic 
trigonometric  functions. 
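As an informal numerical check of this argument, the following Python sketch approximates the inner product $\langle f, g \rangle = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,g(x)\,dx$ by a midpoint rule and evaluates a few of the products used above. The helper names `inner`, `cos_k`, `sin_k`, and `one`, as well as the sample count, are illustrative assumptions and not part of the text.

```python
import math

def inner(f, g, samples=4000):
    """Approximate <f, g> = (1/pi) * integral of f*g over [-pi, pi] (midpoint rule)."""
    h = 2 * math.pi / samples
    total = 0.0
    for i in range(samples):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h / math.pi

def cos_k(k):
    """Hypothetical helper: the function x -> cos(k x)."""
    return lambda x: math.cos(k * x)

def sin_k(k):
    """Hypothetical helper: the function x -> sin(k x)."""
    return lambda x: math.sin(k * x)

one = lambda x: 1.0  # the constant function 1

print(inner(cos_k(2), cos_k(3)))  # two different cosines
print(inner(cos_k(3), cos_k(3)))  # a cosine with itself
print(inner(sin_k(2), cos_k(2)))  # a sine with a cosine
print(inner(one, one))            # squared norm of the constant function
```

Up to rounding, the four printed values should be close to 0, 1, 0, and 2, in line with the orthogonality relations and with the norm $\| 1 \|^2 = 2$ that accounts for the factor $\frac12$ in the constant term.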


Definition 3.2. The Fourier series of a function $f(x)$ defined on $-\pi \le x \le \pi$ is

$$f(x) \sim \frac{a_0}{2} + \sum_{k=1}^{\infty} \bigl[\, a_k \cos kx + b_k \sin kx \,\bigr], \tag{3.34}$$

whose coefficients are given by the inner product formulae

$$\begin{aligned}
a_k &= \langle f, \cos kx \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx \, dx, & k &= 0, 1, 2, 3, \ldots, \\
b_k &= \langle f, \sin kx \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx \, dx, & k &= 1, 2, 3, \ldots.
\end{aligned} \tag{3.35}$$


The function $f(x)$ cannot be completely arbitrary, since, at the very least, the integrals
in the coefficient formulae must be well defined and finite. Even if the coefficients (3.35)
are finite, there is no guarantee that the resulting infinite series converges, and, even if it
converges, no guarantee that it converges to the original function $f(x)$. For these reasons,
we will tend to use the $\sim$ symbol instead of an equal sign when writing down a Fourier
series. Before tackling these critical issues, let us work through an elementary example.

Example 3.3. Consider the function $f(x) = x$. We may compute its Fourier
coefficients directly, employing integration by parts to evaluate the integrals:

$$\begin{aligned}
a_0 &= \frac{1}{\pi} \int_{-\pi}^{\pi} x \, dx = 0, \qquad
a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} x \cos kx \, dx = \frac{1}{\pi} \left[ \frac{x \sin kx}{k} + \frac{\cos kx}{k^2} \right]_{x=-\pi}^{\pi} = 0, \\
b_k &= \frac{1}{\pi} \int_{-\pi}^{\pi} x \sin kx \, dx = \frac{1}{\pi} \left[ -\,\frac{x \cos kx}{k} + \frac{\sin kx}{k^2} \right]_{x=-\pi}^{\pi} = \frac{2}{k}\,(-1)^{k+1}.
\end{aligned} \tag{3.36}$$
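The closed-form coefficients in (3.36) can be compared against a direct numerical evaluation of the integrals (3.35). The sketch below is only an illustration: the function name `fourier_coefficients`, the midpoint rule, and the sample count are assumptions made here, not something prescribed by the text.

```python
import math

def fourier_coefficients(f, n_max, samples=20000):
    """Midpoint-rule approximation of a_0..a_n and b_1..b_n from (3.35)."""
    h = 2 * math.pi / samples
    xs = [-math.pi + (i + 0.5) * h for i in range(samples)]
    fx = [f(x) for x in xs]
    a = [sum(v * math.cos(k * x) for v, x in zip(fx, xs)) * h / math.pi
         for k in range(n_max + 1)]
    b = [0.0] + [sum(v * math.sin(k * x) for v, x in zip(fx, xs)) * h / math.pi
                 for k in range(1, n_max + 1)]
    return a, b  # b[0] is a placeholder so that b[k] lines up with b_k

a, b = fourier_coefficients(lambda x: x, 5)
for k in range(1, 6):
    exact = 2 * (-1) ** (k + 1) / k  # the value of b_k given in (3.36)
    print(k, a[k], b[k], exact)
```

For $f(x) = x$ the computed cosine coefficients should be negligibly small, and the computed sine coefficients should agree with $\frac{2}{k}(-1)^{k+1}$ to within the quadrature error.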


The resulting Fourier series is

$$x \sim 2 \left( \sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \frac{\sin 4x}{4} + \cdots \right). \tag{3.37}$$

Establishing convergence of this infinite series is far from elementary. Standard calculus
criteria, including the ratio and root tests, are inconclusive. Even if we know that the series
converges (which it does, for all $x$), it is certainly not obvious what function it converges
to. Indeed, it cannot converge to the function $f(x) = x$ everywhere! For instance, if $x = \pi$,
then every term in the Fourier series is zero, and so it converges to 0, which is not the
same as $f(\pi) = \pi$.


Recall that the convergence of an infinite series is predicated on the convergence of its
sequence of partial sums, which, in this case, are

$$s_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n} \bigl[\, a_k \cos kx + b_k \sin kx \,\bigr]. \tag{3.38}$$

By definition, the Fourier series converges at a point $x$ if and only if its partial sums have
a limit:

$$\lim_{n \to \infty} s_n(x) = \tilde f(x), \tag{3.39}$$

which may or may not equal the value of the original function $f(x)$. Thus, a key requirement
is to find conditions on the function $f(x)$ that guarantee that the Fourier series converges,
and, even more importantly, that the limiting sum reproduces the original function: $\tilde f(x) = f(x)$.
This will all be done in detail below.
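To make the notion of partial sums concrete, the following Python sketch evaluates $s_n(x)$ for the series (3.37) of $f(x) = x$, using $a_k = 0$ and $b_k = \frac{2}{k}(-1)^{k+1}$ from (3.36). The function name `s_n` and the chosen sample points are illustrative assumptions.

```python
import math

def s_n(x, n):
    """Partial sum (3.38) for f(x) = x, with a_k = 0 and b_k = 2(-1)^(k+1)/k."""
    return sum(2 * (-1) ** (k + 1) * math.sin(k * x) / k for k in range(1, n + 1))

for n in (10, 100, 1000):
    # Compare the partial sums at an interior point and at the endpoint x = pi.
    print(n, s_n(1.0, n), s_n(math.pi, n))
```

At the interior point $x = 1$ the values should drift slowly toward 1, whereas at $x = \pi$ they should stay at 0 up to rounding, which illustrates numerically that the limit there cannot equal $f(\pi) = \pi$.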


Remark: A finite Fourier sum, of the form (3.38), is also known as a trigonometric
polynomial. This is because, by trigonometric identities, it can be re-expressed as a
polynomial $P(\cos x, \sin x)$ in the cosine and sine functions; vice versa, every such polynomial
can be uniquely written as such a sum; see [89] for details.


The passage from trigonometric polynomials to Fourier series might be viewed as
analogous to the passage from polynomials to power series. Recall that the Taylor series
of an infinitely differentiable function $f(x)$ at the point $x = 0$ is

$$f(x) \sim c_0 + c_1 x + \cdots + c_n x^n + \cdots = \sum_{k=0}^{\infty} c_k x^k,$$

where, according to Taylor’s formula, the coefficients $c_k = \dfrac{f^{(k)}(0)}{k!}$ are expressed in terms
of its derivatives at the origin, not by an inner product. The partial sums

$$s_n(x) = c_0 + c_1 x + \cdots + c_n x^n = \sum_{k=0}^{n} c_k x^k$$

of a power series are ordinary polynomials, and the same basic convergence issues arise.

Although  superficially  similar,  in  actuality  the  two  theories  are  profoundly  different. 
Indeed,  while  the  theory  of  power  series  was  well  established  in  the  early  days  of  the 
calculus,  there  remain,  to  this  day,  unresolved  foundational  issues  in  Fourier  theory.  A 
power  series  in  a  real  variable  x  either  converges  everywhere,  or  on  an  interval  centered 
at  0,  or  nowhere  except  at  0.  On  the  other  hand,  a  Fourier  series  can  converge  on  quite 
bizarre  sets.  Secondly,  when  a  power  series  converges,  it  converges  to  an  analytic  function, 
whose  derivatives  are  represented  by  the  differentiated  power  series.  Fourier  series  may 
converge,  not  only  to  continuous  functions,  but  also  to  a  wide  variety  of  discontinuous 
functions and even more general objects. Therefore, term-wise differentiation of a Fourier
series is a nontrivial issue.

Once one appreciates how radically different the two subjects are, one begins to
understand why Fourier’s astonishing claims were initially widely disbelieved. Before that
time, all functions were taken to be analytic. The fact that Fourier series might converge
to a nonanalytic, even discontinuous function was extremely disconcerting, resulting in a
profound re-evaluation of the foundations of function theory and the calculus, culminating
in the modern definitions of function and convergence that you now learn in your first
courses in analysis, [8, 96, 97]. Only through the combined efforts of many of the leading
mathematicians of the nineteenth century was a rigorous theory of Fourier series firmly
established. Section 3.5 contains the most important details, while more comprehensive
treatments can be found in the advanced texts [37, 68, 128].


Exercises 


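3.2.1. Find the Fourier series of the following functions: (a) $\operatorname{sign} x$, (b) $|x|$,
(c) $3x - 1$, (d) $x^2$, (e) $\sin^3 x$, (f) $\sin x \cos x$, (g) $|\sin x|$, (h) $x \cos x$.

3.2.2. Find the Fourier series of the following functions:
(a) $\begin{cases} 1, & |x| < \frac12 \pi, \\ 0, & \text{otherwise}, \end{cases}$
(b) $\begin{cases} 1, & \frac12 \pi < |x| < \pi, \\ 0, & \text{otherwise}, \end{cases}$
(c) $\begin{cases} 1, & \frac12 \pi < x < \pi, \\ 0, & \text{otherwise}, \end{cases}$
(d) $\begin{cases} x, & |x| < \frac12 \pi, \\ 0, & \text{otherwise}, \end{cases}$
(e) $\begin{cases} \cos x, & |x| < \frac12 \pi, \\ 0, & \text{otherwise}. \end{cases}$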

3.2.3. Find the Fourier series of $\sin^2 x$ and $\cos^2 x$ without directly calculating the Fourier
coefficients. Hint: Use some standard trigonometric identities.

♦ 3.2.4. Let $g(x) = \frac12 p_0 + \sum_{k=1}^{n} (p_k \cos kx + q_k \sin kx)$ be a trigonometric polynomial. Explain why
its Fourier coefficients are $a_k = p_k$ and $b_k = q_k$ for $k \le n$, while $a_k = b_k = 0$ for $k > n$.

3.2.5. True or false: (a) The Fourier series for the function $2 f(x)$ is obtained by multiplying
each term in the Fourier series for $f(x)$ by 2. (b) The Fourier series for the function $f(2x)$
is obtained by replacing $x$ by $2x$ in the Fourier series for $f(x)$. (c) The Fourier coefficients
of $f(x) + g(x)$ can be found by adding the corresponding Fourier coefficients of $f(x)$ and
$g(x)$. (d) The Fourier coefficients of $f(x)\,g(x)$ can be found by multiplying the corresponding
Fourier coefficients of $f(x)$ and $g(x)$.

Figure 3.1. $2\pi$-periodic extension of $x$.


Periodic  Extensions 

The trigonometric constituents (3.25) of a Fourier series are all periodic functions of period
$2\pi$. Therefore, if the series converges, the limiting function $\tilde f(x)$ must also be periodic of
period $2\pi$:

$$\tilde f(x + 2\pi) = \tilde f(x) \quad \text{for all } x \in \mathbb{R}.$$

A Fourier series can converge only to a $2\pi$-periodic function. So it was unreasonable to
expect the Fourier series (3.37) to converge to the aperiodic function $f(x) = x$ everywhere.
Rather, it should converge to its “periodic extension”, which we now define.

Lemma 3.4. If $f(x)$ is any function defined for $-\pi < x \le \pi$, then there is a unique
$2\pi$-periodic function $\tilde f$, known as the $2\pi$-periodic extension of $f$, that satisfies $\tilde f(x) = f(x)$
for all $-\pi < x \le \pi$.


Proof: Pictorially, the graph of the periodic extension of a function $f(x)$ is obtained
by repeatedly copying the part of its graph between $-\pi$ and $\pi$ to adjacent intervals of
length $2\pi$; Figure 3.1 shows a simple example. More formally, given $x \in \mathbb{R}$, there is a
unique integer $m$ such that $(2m - 1)\pi < x \le (2m + 1)\pi$. Periodicity of $\tilde f$ leads us to define

$$\tilde f(x) = \tilde f(x - 2m\pi) = f(x - 2m\pi). \tag{3.40}$$

In particular, if $-\pi < x \le \pi$, then $m = 0$, and hence $\tilde f(x) = f(x)$ for such $x$. The proof
that the resulting function $\tilde f$ is $2\pi$-periodic is left as Exercise 3.2.8. Q.E.D.
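The construction (3.40) translates almost directly into code. The Python sketch below is one possible rendering, assuming $f$ is given as a callable on $-\pi < x \le \pi$; the names `periodic_extension`, `f_tilde`, and `saw` are hypothetical, and the integer $m$ is found with a ceiling, which is an implementation choice rather than something stated in the proof.

```python
import math

def periodic_extension(f):
    """Return the 2*pi-periodic extension of f, where f is defined on (-pi, pi]."""
    def f_tilde(x):
        # The unique integer m with (2m - 1)*pi < x <= (2m + 1)*pi.
        m = math.ceil((x - math.pi) / (2 * math.pi))
        return f(x - 2 * m * math.pi)
    return f_tilde

saw = periodic_extension(lambda x: x)  # extension of f(x) = x
print(saw(1.0), saw(4.0), saw(0.5 + 2 * math.pi), saw(-math.pi))
```

Note that this version assigns the value $f(\pi)$ at the odd multiples of $\pi$, following the convention of Lemma 3.4; floating-point rounding can affect which interval a point extremely close to an odd multiple of $\pi$ is assigned to.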


Remark: The construction of the periodic extension in Lemma 3.4 uses the value $f(\pi)$
at the right endpoint and requires $\tilde f(-\pi) = \tilde f(\pi) = f(\pi)$. One could, alternatively, require
$\tilde f(\pi) = \tilde f(-\pi) = f(-\pi)$, which, if $f(-\pi) \ne f(\pi)$, leads to a slightly different $2\pi$-periodic
extension of the function. There is no a priori reason to prefer one over the other. In fact,
as we shall discover, the preferred Fourier periodic extension $\tilde f(x)$ takes the average of the
two values:

$$\tilde f(\pi) = \tilde f(-\pi) = \tfrac12 \bigl[\, f(\pi) + f(-\pi) \,\bigr], \tag{3.41}$$

which then fixes its values at the odd multiples of $\pi$.

Example 3.5. The $2\pi$-periodic extension of $f(x) = x$ is the “sawtooth” function
$\tilde f(x)$ graphed in Figure 3.1. It agrees with $x$ between $-\pi$ and $\pi$. Since $f(\pi) = \pi$, $f(-\pi) = -\pi$,
the Fourier extension (3.41) sets $\tilde f(k\pi) = 0$ for any odd integer $k$. Explicitly,

$$\tilde f(x) = \begin{cases} x - 2m\pi, & (2m - 1)\pi < x < (2m + 1)\pi, \\ 0, & x = (2m - 1)\pi, \end{cases} \qquad \text{where $m$ is any integer.}$$


With this convention, it can be proved that the Fourier series (3.37) converges everywhere
to the $2\pi$-periodic extension $\tilde f(x)$. In particular,

$$2 \sum_{k=1}^{\infty} (-1)^{k+1}\, \frac{\sin kx}{k} = \begin{cases} x, & -\pi < x < \pi, \\ 0, & x = \pm\pi. \end{cases} \tag{3.42}$$

Even this very simple example has remarkable and nontrivial consequences. For
instance, if we substitute $x = \frac12 \pi$ in (3.42) and divide by 2, we obtain Gregory’s series

$$\frac{\pi}{4} = 1 - \frac13 + \frac15 - \frac17 + \frac19 - \cdots. \tag{3.43}$$

While this striking formula predates Fourier theory (it was, in fact, first discovered by
Leibniz), a direct proof is not easy.


Remark: While numerologically fascinating, Gregory’s series is of scant practical use
for actually computing $\pi$, since its rate of convergence is painfully slow. The reader may
wish to try adding up terms to see how far out one needs to go to accurately compute
even the first two decimal digits of $\pi$. Round-off errors will eventually interfere with any
attempt to numerically compute the summation with any reasonable degree of accuracy.
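The slow convergence is easy to observe experimentally. The following Python sketch sums the first terms of Gregory’s series (3.43) and prints the error against the library value of $\pi$; the function name `gregory_pi` and the chosen term counts are illustrative assumptions.

```python
import math

def gregory_pi(n_terms):
    """Approximate pi by 4 times the sum of the first n_terms terms of (3.43)."""
    return 4 * sum((-1) ** j / (2 * j + 1) for j in range(n_terms))

for n in (10, 100, 1000, 10000):
    approx = gregory_pi(n)
    print(n, approx, abs(approx - math.pi))
```

Since the series is alternating with decreasing terms, the error after $n$ terms is bounded by the first omitted term, $4/(2n + 1)$, so each additional correct digit costs roughly ten times as many terms.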


Exercises 


3.2.6. Graph the $2\pi$-periodic extension of each of the following functions. Which extensions
are continuous? Differentiable? (a) $x^2$, (b) $(x^2 - \pi^2)^2$, (c) $e^x$, (d) $e^{-|x|}$,
(e) $\sinh x$, (f) $1 + \cos^2 x$, (g) $\sin \frac12 \pi x$, (h) $\dfrac{1}{x}$, (i) $\dfrac{1}{1 + x^2}$.

3.2.7. Sketch a graph of the $2\pi$-periodic extension of each of the functions in Exercise 3.2.2.

3.2.8. Complete the proof of Lemma 3.4 by showing that $\tilde f(x)$ is $2\pi$-periodic.

♦ 3.2.9. Suppose $f(x)$ is periodic with period $\ell$ and integrable. Prove that, for any $a$,
(a) $\displaystyle \int_a^{a+\ell} f(x)\,dx = \int_0^{\ell} f(x)\,dx$, (b) $\displaystyle \int_0^{\ell} f(x + a)\,dx = \int_0^{\ell} f(x)\,dx$.

♥ 3.2.10. Let $f(x)$ be a sufficiently nice $2\pi$-periodic function. (a) Prove that $f'(x)$ is $2\pi$-periodic.
(b) Prove that if $\displaystyle \int_{-\pi}^{\pi} f(x)\,dx = 0$, then $g(x) = \displaystyle \int_0^x f(y)\,dy$ is $2\pi$-periodic.
(c) Does the result in part (b) rely on the fact that the lower limit in the integral
for $g(x)$ is 0? (d) More generally, prove that if $f(x)$ has mean $m = \dfrac{1}{2\pi} \displaystyle\int_{-\pi}^{\pi} f(x)\,dx$, then
the function $g(x) = \displaystyle\int_0^x f(y)\,dy - m\,x$ is $2\pi$-periodic.

Figure 3.2. Piecewise continuous function.


♦ 3.2.11. Given a function $f(x)$ defined for $0 \le x < \ell$, prove that there is a unique periodic
function of period $\ell$ that agrees with $f$ on the interval $[0, \ell)$. If $\ell = 2\pi$, is this the same
periodic extension as we constructed in the text? Explain your answer. Try the case $f(x) = x$
as an illustrative example.


3.2.12.  Use  the  method  in  Exercise  3.2.11  to  construct  and  graph  the  1-periodic  extensions  of 

f  2  i  ~  i  -  1 

the following functions: (a) $x^2$, (b) $e^{-x}$, (c) $\cos \pi x$, (d)


I  X  I  <  2  T 

0,  otherwise. 


3.2.13. (a) How many terms in Gregory’s series (3.43) are required to compute the first two
decimal digits of $\pi$? (b) The first 10 decimal digits? Hint: Use the fact that it is an
alternating series. (c) For part (a), try summing up the required number of terms on your
computer, and check whether you obtain an accurate result.


Piecewise  Continuous  Functions 


As we shall see, all continuously differentiable $2\pi$-periodic functions can be represented
as convergent Fourier series. More generally, we can allow functions that have simple
discontinuities.

Definition 3.6. A function $f(x)$ is said to be piecewise continuous on an interval
$[a, b]$ if it is defined and continuous except possibly at a finite number of points
$a \le x_1 < x_2 < \cdots < x_n \le b$. Furthermore, at each point of discontinuity, we require that the left-
and right-hand limits

$$f(x_k^-) = \lim_{x \to x_k^-} f(x), \qquad f(x_k^+) = \lim_{x \to x_k^+} f(x), \tag{3.44}$$

exist. (At the endpoints $a$, $b$, existence of only one of the limits, namely $f(a^+)$ and $f(b^-)$,
is required.) Note that we do not require that $f(x)$ be defined at $x_k$. Even if $f(x_k)$ is
defined, it does not necessarily equal either the left- or the right-hand limit.


A representative graph of a piecewise continuous function appears in Figure 3.2. The
points $x_k$ are known as jump discontinuities of $f(x)$, and the difference

$$\beta_k = f(x_k^+) - f(x_k^-) = \lim_{x \to x_k^+} f(x) - \lim_{x \to x_k^-} f(x) \tag{3.45}$$

between the left- and right-hand limits is the magnitude of the jump. Note that the value of
the function at the discontinuity, namely $f(x_k)$, which may not even be defined, plays
no role in the specification of the jump magnitude. The jump magnitude is positive if
the function jumps up (when moving from left to right) at $x_k$ and negative if it jumps
down. If the jump magnitude vanishes, $\beta_k = 0$, the left- and right-hand limits agree,
and the discontinuity is removable, since redefining $f(x_k) = f(x_k^+) = f(x_k^-)$ makes $f(x)$
continuous at $x = x_k$. Since removable discontinuities have no effect in either the theory
or applications, they can always be removed without penalty.

Figure 3.3. The unit step function.

The simplest example of a piecewise continuous function is the unit step function

$$\sigma(x) = \begin{cases} 1, & x > 0, \\ 0, & x < 0, \end{cases} \tag{3.46}$$

graphed in Figure 3.3. It has a single jump discontinuity at $x = 0$ of magnitude 1:

$$\sigma(0^+) - \sigma(0^-) = 1 - 0 = 1,$$

and is continuous (indeed, locally constant) everywhere else. If we translate and scale
the step function, we obtain a function

$$h(x) = \beta\, \sigma(x - \xi) = \begin{cases} \beta, & x > \xi, \\ 0, & x < \xi, \end{cases} \tag{3.47}$$

with a single jump discontinuity of magnitude $\beta$ at the point $x = \xi$.
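As a small illustration of jump magnitudes, the Python sketch below builds a combination of translated and scaled step functions and estimates $f(x_0^+) - f(x_0^-)$ by sampling just to the right and left of a point. The names `sigma`, `jump`, and `h`, the offset `eps`, the particular coefficients, and the value assigned to `sigma(0)` are all assumptions made for this example; the text leaves $\sigma(0)$ undefined.

```python
def sigma(x):
    """Unit step function (3.46); the value at x = 0 is an arbitrary choice here."""
    if x > 0:
        return 1.0
    if x < 0:
        return 0.0
    return 0.5  # assumption: (3.46) does not specify sigma(0)

def jump(f, x0, eps=1e-9):
    """Estimate the jump f(x0+) - f(x0-) by sampling just right and left of x0."""
    return f(x0 + eps) - f(x0 - eps)

# A sample combination of translated, scaled steps: jumps of 3 at x = 1 and -2 at x = -2.
h = lambda x: 3 * sigma(x - 1) - 2 * sigma(x + 2)

print(jump(h, 1.0), jump(h, -2.0), jump(h, 0.0))
```

For piecewise constant functions such as this one the estimate is exact; for a general piecewise continuous function it only approximates the jump, with an error that depends on `eps` and on how quickly the function varies near the point.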

If $f(x)$ is any piecewise continuous function on $[-\pi, \pi]$, then its Fourier coefficients
are well defined: the integrals (3.35) exist and are finite. Continuity, however, is not
enough to ensure convergence of the associated Fourier series.


Definition 3.7. A function $f(x)$ is called piecewise $C^1$ on an interval $[a, b]$ if it is
defined, continuous, and continuously differentiable except at a finite number of points
$a \le x_1 < x_2 < \cdots < x_n \le b$. At each exceptional point, the left- and right-hand limits† of
both the function and its derivative exist:

$$\begin{aligned}
f(x_k^-) &= \lim_{x \to x_k^-} f(x), & f(x_k^+) &= \lim_{x \to x_k^+} f(x), \\
f'(x_k^-) &= \lim_{x \to x_k^-} f'(x), & f'(x_k^+) &= \lim_{x \to x_k^+} f'(x).
\end{aligned}$$

Figure 3.4. Piecewise $C^1$ function.

See Figure 3.4 for a representative graph. For a piecewise $C^1$ function, an exceptional
point $x_k$ is either

• a jump discontinuity where the left- and right-hand derivatives exist, or

• a corner, meaning a point where $f$ is continuous, so $f(x_k^-) = f(x_k^+)$, but has different
left- and right-hand derivatives: $f'(x_k^-) \ne f'(x_k^+)$.

Thus, at each point, including jump discontinuities, the graph of $f(x)$ has well-defined
right and left tangent lines. For example, the function $f(x) = |x|$ is piecewise $C^1$, since it
is continuous everywhere and has a corner at $x = 0$, with $f'(0^+) = +1$, $f'(0^-) = -1$.

There is an analogous definition of piecewise $C^n$ functions. One requires that the
function have $n$ continuous derivatives, except at a finite number of points. Moreover,
at every point, the function must have well-defined left- and right-hand limits of all its
derivatives up to order $n$.

Finally, a function $f(x)$ defined for all $x \in \mathbb{R}$ is piecewise continuous (or $C^1$ or $C^n$)
provided it is piecewise continuous (or $C^1$ or $C^n$) on any bounded interval. Thus, a piecewise
continuous function on $\mathbb{R}$ can have an infinite number of discontinuities, but they are not
allowed to accumulate at any finite limit point. In particular, a $2\pi$-periodic function $f(x)$
is piecewise continuous if and only if it is piecewise continuous on the interval $[-\pi, \pi]$.


† As before, at the endpoints we require only the appropriate one-sided limits, namely $f(a^+)$,
$f'(a^+)$, $f(b^-)$, and $f'(b^-)$, to exist.

Exercises

3.2.14. Find the discontinuities and the jump magnitudes for the following piecewise
continuous functions:
(a) $2\sigma(x) + \sigma(x + 1) - 3\sigma(x - 1)$, (b) $\operatorname{sign}(x^2 - 2x)$, (c) $\sigma(x^2 - 2x)$, (d) $|x^2 - 2x|$,
(e) $\sqrt{|x - 2|}$, (f) $\sigma(\sin x)$, (g) $\operatorname{sign}(\sin x)$, (h) $|\sin x|$, (i) $e^{\sigma(x)}$, (j) $\sigma(e^x)$, (k) $e^{|x - 2|}$.

3.2.15. Graph the following piecewise continuous functions. List all discontinuities and jump
magnitudes.


(a) 


e 

0. 


X 


1  < 


X 


<  2. 


(d) 


f  X 

X 

l  X  , 

X 

otherwise, 


<  1, 
>  1, 


(b) 


sinx,  0  <  x  <  7r, 

0,  otherwise, 


smx 


0  < 


(c)  < 


1, 

0, 


X 


X 


<  2i r. 


x  —  0, 
otherwise, 


(e) 


—  1  <  x  <  0, 
0  <  X  <  7T, 
otherwise, 


(0 


1 

2 


1  +  £ 


2  ’ 


>  i, 

<  l. 


3.2.16.  Are  the  functions  in  Exercises  3.2.14  and  3.2.15  piecewise  C1?  If  so,  list  all  corners. 

3.2.17. Prove that the $n$th order ramp function $\rho_n(x - \xi) = \begin{cases} \dfrac{(x - \xi)^n}{n!}, & x > \xi, \\ 0, & x < \xi, \end{cases}$ is piecewise
$C^k$ for any $k \ge 0$.


3.2.18. Is $x^{1/3}$ piecewise continuous? piecewise $C^1$? piecewise $C^2$?


3.2.19.  Answer  Exercise  3.2.18  for 

1 


(a) 


x 


1 


(d)  x ^  sin  —  ,  (e) 


x 


(f) 


(b)  (c)  e-V*',  ,, 

3.2.20. (a) Give an example of a function that is continuous but not piecewise $C^1$.
(b) Give an example that is piecewise $C^1$ but not piecewise $C^2$.


x 


3/2 


3.2.21. (a) Prove that the sum $f + g$ of two piecewise continuous functions is piecewise
continuous. (b) Where are the jump discontinuities of $f + g$? What are the jump magnitudes?
(c) Check your result by summing the functions in parts (a) and (b) of Exercise 3.2.14.

3.2.22. Give an example of two piecewise continuous (but not continuous) functions $f, g$ whose
sum $f + g$ is continuous. Can you characterize all such pairs of functions?

♥ 3.2.23. (a) Prove that if $f(x)$ is piecewise continuous on $[-\pi, \pi]$, then its $2\pi$-periodic extension
is piecewise continuous on all of $\mathbb{R}$. Where are its jump discontinuities and what are their
magnitudes? (b) Similarly, prove that if $f(x)$ is piecewise $C^1$, then its periodic extension is
piecewise $C^1$. Where are the corners?

3.2.24. True or false: (a) If $f(x)$ is a piecewise continuous function, its absolute value $|f(x)|$ is
piecewise continuous. If true, what are the jumps and their magnitudes?
(b) If $f(x)$ is piecewise $C^1$, then $|f(x)|$ is piecewise $C^1$. If true, what are the corners?


The  Convergence  Theorem 


We are now able to state the fundamental convergence theorem for Fourier series. But we
will postpone a discussion of its proof until the end of Section 3.5.

Theorem 3.8. If $\tilde f(x)$ is a $2\pi$-periodic, piecewise $C^1$ function, then, at any $x \in \mathbb{R}$,
its Fourier series converges to

$$\begin{cases} \tilde f(x), & \text{if $\tilde f$ is continuous at $x$,} \\ \tfrac12 \bigl[\, \tilde f(x^+) + \tilde f(x^-) \,\bigr], & \text{if $x$ is a jump discontinuity.} \end{cases}$$

Thus, the Fourier series converges, as expected, to $\tilde f(x)$ at all points of continuity.
At discontinuities, it apparently can’t decide whether to converge to the left- or right-hand
limit, and so ends up “splitting the difference” by converging to their average; see
Figure 3.5. If we redefine $\tilde f(x)$ at its jump discontinuities to have the average limiting
value, so

$$\tilde f(x) = \tfrac12 \bigl[\, \tilde f(x^+) + \tilde f(x^-) \,\bigr] \tag{3.48}$$

(an equation that automatically holds at all points of continuity), then Theorem 3.8
would say that the Fourier series converges to the $2\pi$-periodic piecewise $C^1$ function $\tilde f(x)$
everywhere.

Example 3.9. Let $\sigma(x)$ denote the unit step function (3.46). Its Fourier coefficients
are easily computed:

$$\begin{aligned}
a_0 &= \frac{1}{\pi} \int_{-\pi}^{\pi} \sigma(x)\,dx = \frac{1}{\pi} \int_0^{\pi} dx = 1, \\
a_k &= \frac{1}{\pi} \int_{-\pi}^{\pi} \sigma(x) \cos kx\,dx = \frac{1}{\pi} \int_0^{\pi} \cos kx\,dx = 0, \\
b_k &= \frac{1}{\pi} \int_{-\pi}^{\pi} \sigma(x) \sin kx\,dx = \frac{1}{\pi} \int_0^{\pi} \sin kx\,dx = \begin{cases} \dfrac{2}{k\pi}, & k = 2l + 1 \text{ odd}, \\ 0, & k = 2l \text{ even}. \end{cases}
\end{aligned}$$

Therefore, the Fourier series for the step function is

$$\sigma(x) \sim \frac12 + \frac{2}{\pi} \left( \sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \frac{\sin 7x}{7} + \cdots \right). \tag{3.49}$$

According to Theorem 3.8, the Fourier series will converge to its $2\pi$-periodic extension,

$$\tilde\sigma(x) = \begin{cases} 0, & (2m - 1)\pi < x < 2m\pi, \\ 1, & 2m\pi < x < (2m + 1)\pi, \\ \tfrac12, & x = m\pi, \end{cases} \qquad \text{where $m$ is any integer,}$$

which is plotted in Figure 3.6. Observe that, in accordance with Theorem 3.8, $\tilde\sigma(x)$ takes
the midpoint value $\frac12$ at the jump discontinuities $0, \pm\pi, \pm 2\pi, \ldots$.

Figure 3.6. $2\pi$-periodic step function.

It  is  instructive  to  investigate  the  convergence  of  this  particular  Fourier  series  in  some 
detail.  Figure  3.7  displays  a  graph  of  the  first  few  partial  sums,  taking,  respectively, 
n  =  4, 10,  and  20  terms.  The  reader  will  notice  that  away  from  the  discontinuities,  the 
series  indeed  appears  to  be  converging,  albeit  slowly.  However,  near  the  jumps  there  is  a 
consistent  overshoot  of  about  9%  of  the  jump  magnitude.  The  region  where  the  overshoot 
occurs  becomes  narrower  and  narrower  as  the  number  of  terms  increases,  but  the  actual 
amount  of  overshoot  persists  no  matter  how  many  terms  are  summed  up.  This  was  first 
noted  by  the  American  physicist  Josiah  Gibbs,  and  is  now  known  as  the  Gibbs  phenomenon 
in  his  honor.  The  Gibbs  overshoot  is  a  manifestation  of  the  subtle  nonuniform  convergence 
of  the  Fourier  series. 
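The persistence of the overshoot can be observed numerically. The Python sketch below evaluates partial sums of the step function series (3.49) on a uniform grid just to the right of the jump at $x = 0$ and reports how far the largest sampled value exceeds 1. The names `step_partial_sum` and `max_overshoot`, the grid size, and the particular values of $n$ are assumptions made for this illustration, and a sampled maximum only approximates the true one.

```python
import math

def step_partial_sum(x, n):
    """Partial sum of (3.49) containing all harmonics sin(kx) with odd k <= n."""
    return 0.5 + (2 / math.pi) * sum(math.sin(k * x) / k for k in range(1, n + 1, 2))

def max_overshoot(n, grid=2000):
    """Largest sampled value of s_n(x) - 1 on 0 < x <= pi/2."""
    xs = [(i + 1) * (math.pi / 2) / grid for i in range(grid)]
    return max(step_partial_sum(x, n) for x in xs) - 1.0

for n in (9, 39, 159):
    print(n, max_overshoot(n))
```

Since the jump magnitude here is 1, the printed values can be read directly as fractions of the jump; they should remain near 0.09 as $n$ grows, while the location of the peak moves toward the discontinuity.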


Exercises 


3.2.25. (a) Sketch the $2\pi$-periodic half-wave $f(x) = \begin{cases} \sin x, & 0 < x < \pi, \\ 0, & -\pi < x < 0. \end{cases}$ (b) Find its
Fourier series. (c) Graph the first five Fourier sums and compare with the function.
(d) Discuss convergence of the Fourier series.

3.2.26. Answer Exercise 3.2.25 for the cosine half-wave $f(x) = \begin{cases} \cos x, & 0 < x < \pi, \\ 0, & -\pi < x < 0. \end{cases}$

3.2.27. (a) Find the Fourier series for $f(x) = e^x$. (b) For which values of $x$ does the Fourier
series converge? (c) Graph the function it converges to.

3.2.28. (a) Use a graphing package to investigate the Gibbs phenomenon for the Fourier series
(3.37) of the function $x$. Determine the amount of overshoot of the partial sums at the
discontinuities. (b) How many terms do you need to approximate the function to within two
decimal places at $x = 2.0$? At $x = 3.0$?

3.2.29. Use the Fourier series (3.49) for the step function to rederive Gregory’s series (3.43).

♦ 3.2.30. Suppose $a_k, b_k$ are the Fourier coefficients of the function $f(x)$. (a) To which function
does the Fourier series $\dfrac{a_0}{2} + \displaystyle\sum_{k=1}^{\infty} [\, a_k \cos 2kx + b_k \sin 2kx \,]$ converge? Hint: The answer is
not $f(2x)$. (b) Test your answer with the Fourier series (3.37) for $f(x) = x$.


Even  and  Odd  Functions 


We already noted that the Fourier cosine coefficients of the function $f(x) = x$ are all 0.
This is not an accident, but, rather, a consequence of the fact that $x$ is an odd function.
Recall first the basic definition:

Definition 3.10. A function is called even if $f(-x) = f(x)$. A function is called odd
if $f(-x) = -f(x)$.

For example, the functions $1$, $\cos kx$, and $x^2$ are all even, whereas $x$, $\sin kx$, and $\operatorname{sign} x$
are odd. Note that an odd function necessarily has $f(0) = 0$. We require three elementary
lemmas, whose proofs are left to the reader.

Lemma 3.11. The sum, $f(x) + g(x)$, of two even functions is even; the sum of two
odd functions is odd.

Remark: Every function can be represented as the sum of an even and an odd function;
see Exercise 3.2.32.

Lemma 3.12. The product $f(x)\,g(x)$ of two even functions, or of two odd functions,
is an even function. The product of an even and an odd function is odd.

Lemma 3.13. If $f(x)$ is odd and integrable on the symmetric interval $[-a, a]$,
then $\displaystyle\int_{-a}^{a} f(x)\,dx = 0$. If $f(x)$ is even and integrable, then $\displaystyle\int_{-a}^{a} f(x)\,dx = 2 \int_0^a f(x)\,dx$.

The next result is an immediate consequence of applying Lemmas 3.12 and 3.13 to the
Fourier integrals (3.35).

Proposition 3.14. If $f(x)$ is even, then its Fourier sine coefficients all vanish, $b_k = 0$,
and so $f(x)$ can be represented by a Fourier cosine series

$$f(x) \sim \frac{a_0}{2} + \sum_{k=1}^{\infty} a_k \cos kx, \tag{3.50}$$

where

$$a_k = \frac{2}{\pi} \int_0^{\pi} f(x) \cos kx\,dx, \qquad k = 0, 1, 2, 3, \ldots. \tag{3.51}$$

If $f(x)$ is odd, then its Fourier cosine coefficients vanish, $a_k = 0$, and so $f(x)$ can be
represented by a Fourier sine series

$$f(x) \sim \sum_{k=1}^{\infty} b_k \sin kx, \tag{3.52}$$

where

$$b_k = \frac{2}{\pi} \int_0^{\pi} f(x) \sin kx\,dx, \qquad k = 1, 2, 3, \ldots. \tag{3.53}$$

Conversely, a convergent Fourier cosine series always represents an even function, while a
convergent sine series always represents an odd function.

Figure 3.8. $2\pi$-periodic extension of $|x|$.


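Example 3.15. The absolute value $f(x) = |x|$ is an even function, and hence has a
Fourier cosine series. The coefficients are

$$\begin{aligned}
a_0 &= \frac{2}{\pi} \int_0^{\pi} x\,dx = \pi, \\
a_k &= \frac{2}{\pi} \int_0^{\pi} x \cos kx\,dx = \frac{2}{\pi} \left[ \frac{x \sin kx}{k} + \frac{\cos kx}{k^2} \right]_{x=0}^{\pi} = \begin{cases} 0, & 0 \ne k \text{ even}, \\ -\,\dfrac{4}{k^2 \pi}, & k \text{ odd}. \end{cases}
\end{aligned} \tag{3.54}$$

Therefore

$$|x| \sim \frac{\pi}{2} - \frac{4}{\pi} \left( \cos x + \frac{\cos 3x}{9} + \frac{\cos 5x}{25} + \frac{\cos 7x}{49} + \cdots \right). \tag{3.55}$$

According to Theorem 3.8, this Fourier cosine series converges to the $2\pi$-periodic extension
of $|x|$, the “sawtooth function” graphed in Figure 3.8.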
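The convergence of the cosine series (3.55) can be examined numerically. The Python sketch below compares partial sums with $|x|$ on a grid over $[-\pi, \pi]$ and reports the largest sampled error; the function names `abs_partial_sum` and `max_error`, the grid size, and the values of $n$ are assumptions chosen for illustration.

```python
import math

def abs_partial_sum(x, n):
    """Partial sum of (3.55): pi/2 - (4/pi) * sum over odd k <= n of cos(kx)/k^2."""
    return math.pi / 2 - (4 / math.pi) * sum(
        math.cos(k * x) / k ** 2 for k in range(1, n + 1, 2))

def max_error(n, grid=400):
    """Largest sampled value of |s_n(x) - |x|| on a uniform grid over [-pi, pi]."""
    xs = [-math.pi + 2 * math.pi * i / grid for i in range(grid + 1)]
    return max(abs(abs_partial_sum(x, n) - abs(x)) for x in xs)

for n in (9, 99, 999):
    print(n, max_error(n))
```

Because the coefficients decay like $1/k^2$, the error is bounded by the tail $\frac{4}{\pi} \sum_{k > n,\ k \text{ odd}} k^{-2}$ at every point, so no overshoot of the kind produced by a jump discontinuity should appear here.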

In particular, if we substitute $x = 0$, we obtain another interesting series:

$$\frac{\pi^2}{8} = 1 + \frac19 + \frac1{25} + \frac1{49} + \cdots = \sum_{j=0}^{\infty} \frac{1}{(2j + 1)^2}. \tag{3.56}$$


It converges faster than Gregory’s series (3.43), and, while far from optimal in this regard,
can be used to compute reasonable approximations to $\pi$. One can further manipulate this
result to compute the sum of the series

$$S = \sum_{k=1}^{\infty} \frac{1}{k^2} = 1 + \frac14 + \frac19 + \frac1{16} + \frac1{25} + \frac1{36} + \frac1{49} + \cdots.$$

We note that

$$\frac{S}{4} = \sum_{k=1}^{\infty} \frac{1}{4k^2} = \sum_{k=1}^{\infty} \frac{1}{(2k)^2} = \frac14 + \frac1{16} + \frac1{36} + \frac1{64} + \cdots.$$

Therefore, by (3.56),

$$\frac34\,S = S - \frac{S}{4} = 1 + \frac19 + \frac1{25} + \frac1{49} + \cdots = \frac{\pi^2}{8},$$

from which we conclude that

$$S = \sum_{k=1}^{\infty} \frac{1}{k^2} = 1 + \frac14 + \frac19 + \frac1{16} + \frac1{25} + \cdots = \frac{\pi^2}{6}. \tag{3.57}$$
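Both sums can be checked numerically, and (3.56) can be turned into an approximation of $\pi$. The following Python sketch is only an illustration; the function names `odd_sum` and `full_sum` and the number of terms are assumptions.

```python
import math

def odd_sum(n):
    """Sum of 1/(2j+1)^2 for j = 0..n-1, a partial sum of (3.56)."""
    return sum(1 / (2 * j + 1) ** 2 for j in range(n))

def full_sum(n):
    """Sum of 1/k^2 for k = 1..n, a partial sum of (3.57)."""
    return sum(1 / k ** 2 for k in range(1, n + 1))

n = 100000
print(odd_sum(n), math.pi ** 2 / 8)
print(full_sum(n), math.pi ** 2 / 6)
print(math.sqrt(8 * odd_sum(n)), math.pi)  # pi recovered from (3.56)
```

The omitted tails are roughly $1/(4n)$ for the odd sum and $1/n$ for the full sum, which is markedly better than Gregory’s series but still only algebraic convergence.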


Remark :  The  most  famous  function  in  number  theory  —  and  the  source  of  the  most 
outstanding  problem  in  mathematics,  the  Riemann  hypothesis  —  is  the  Riemann  zeta 
function 


oo 


««)  =  ZF. 


k=  1 


(3.58) 


Formula  (3.57)  shows  that  £(2) 


6*‘ 


In  fact,  the  value  of  the  zeta  function  at  any  even 


positive  integer  s  =  2  j  is  a  rational  polynomial  in  7r,  [9].  Because  of  its  importance  to  the 
study  of  prime  numbers,  locating  all  the  complex  zeros  of  the  zeta  function  will  earn  you 
$1,000,000  -  see  http :  //www .  claymath .  org  for  details. 

Any function $f(x)$ defined on $[0, \pi]$ has a unique even extension to $[-\pi, \pi]$, obtained
by setting $f(-x) = f(x)$ for $-\pi \le x < 0$, and also a unique odd extension, where now
$f(-x) = -f(x)$ and $f(0) = 0$. These in turn can be periodically extended to the entire real
line. The Fourier cosine series of $f(x)$ is defined by the formulas (3.50-51), and represents
the even, $2\pi$-periodic extension. Similarly, the formulas (3.52-53) define the Fourier sine
series of $f(x)$, representing its odd, $2\pi$-periodic extension.

Example 3.16. Suppose $f(x) = \sin x$. Its Fourier cosine series has coefficients

$$a_k = \frac{2}{\pi}\int_0^{\pi} \sin x \cos kx \, dx = \begin{cases} \dfrac{4}{(1 - k^2)\,\pi}, & k \text{ even}, \\[1ex] 0, & k \text{ odd}. \end{cases}$$

The resulting cosine series represents the even, $2\pi$-periodic extension of $\sin x$, namely

$$|\sin x| \sim \frac{2}{\pi} - \frac{4}{\pi}\sum_{j=1}^{\infty} \frac{\cos 2jx}{4j^2 - 1}.$$

On the other hand, $f(x) = \sin x$ is already odd, and so its Fourier sine series coincides with
its ordinary Fourier series, namely $\sin x$, all the other Fourier sine coefficients being zero;
in other words, $b_1 = 1$, while $b_k = 0$ for $k > 1$.
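The cosine coefficients in Example 3.16 can be confirmed by numerical quadrature. This is a rough sketch under stated assumptions: the helper names `simpson`, `cosine_coefficient`, and `exact_coefficient` are hypothetical, and the composite Simpson rule is just one convenient choice of integrator for a smooth integrand.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def cosine_coefficient(f, k):
    """Fourier cosine coefficient a_k = (2/pi) * integral_0^pi f(x) cos(kx) dx."""
    return (2.0 / math.pi) * simpson(lambda x: f(x) * math.cos(k * x), 0.0, math.pi)


def exact_coefficient(k):
    """Closed form from Example 3.16 for f(x) = sin x."""
    return 4.0 / ((1 - k * k) * math.pi) if k % 2 == 0 else 0.0


if __name__ == "__main__":
    for k in range(6):
        print(k, cosine_coefficient(math.sin, k), exact_coefficient(k))
```

Note that $k = 0$ gives $a_0 = 4/\pi$, so the constant term $a_0/2$ of the series is $2/\pi$, in agreement with the displayed expansion of $|\sin x|$.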


Exercises 


3.2.31. Are the following functions even, odd, or neither?

(a) $x^2$, (b) $e^x$, (c) $\sinh x$, (d) $\sin \pi x$, (e) $\dfrac{1}{x}$, (f) $\dfrac{1}{1 + x^2}$, (g) $\tan^{-1} x$.


3.2.32.  Prove  that  (a)  the  sum  of  two  even  functions  is  even;  (b)  the  sum  of  two  odd  functions 
is  odd;  (c)  every  function  is  the  sum  of  an  even  and  an  odd  function. 

0  3.2.33.  Prove  (a)  Lemma  3.12;  (b)  Lemma  3.13. 
3.2.34. If $f(x)$ is odd, is $f'(x)$ (i) even? (ii) odd? (iii) neither? (iv) could be either?

3.2.35. If $f'(x)$ is even, is $f(x)$ (i) even? (ii) odd? (iii) neither? (iv) could be either? How
do you reconcile your answer with Exercise 3.2.34?

3.2.36.  Answer  Exercise  3.2.34  for  f"(x). 

3.2.37. True or false: (a) If $f(x)$ is odd, its $2\pi$-periodic extension is odd.

(b) If the $2\pi$-periodic extension of $f(x)$ is odd, then $f(x)$ is odd.

3.2.38. Let $\tilde f(x)$ denote the odd, $2\pi$-periodic Fourier extension of a function $f(x)$ defined on
$[0, \pi]$. Explain why $\tilde f(k\pi) = 0$ for any integer $k$.

3.2.39. Construct and graph the even and odd $2\pi$-periodic extensions of the function $f(x) =
1 - x$. What are their Fourier series? Discuss convergence of each.


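3.2.40. Find the Fourier series and discuss convergence for: (a) the box function

$$b(x) = \begin{cases} 1, & |x| < \tfrac{1}{2}\pi, \\ 0, & \tfrac{1}{2}\pi < |x| < \pi, \end{cases}$$

(b) the hat function

$$h(x) = \begin{cases} 1 - |x|, & |x| < 1, \\ 0, & 1 < |x| < \pi. \end{cases}$$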


3.2.41. Find the Fourier sine and cosine series of the following functions. Then graph the function
to which the series converges. (a) 1, (b) $\cos x$, (c) $\sin^3 x$, (d) $x(\pi - x)$.


3.2.42. Find the Fourier series of the hyperbolic functions $\cosh mx$ and $\sinh mx$.

3.2.43. Use the Fourier cosine series of the function $|\sin x|$ constructed in Example 3.16 to
evaluate the sums $\sum_{k=1}^{\infty} (4k^2 - 1)^{-1}$ and $\sum_{k=1}^{\infty} (-1)^{k-1} (4k^2 - 1)^{-1}$.


3.2.44.  True  or  false:  The  sum  of  the  Fourier  cosine  series  and  the  Fourier  sine  series  of  the 
function  f(x)  is  the  Fourier  series  for  f(x).  If  false,  what  function  is  represented  by  the 
combined  Fourier  series? 


3.2.45. (a) Show that if a function is periodic of period $\pi$, then its Fourier series contains only
even  terms,  i.e.,  ak  =  bk  =  0  whenever  k  =  2  j  +  1  is  odd.  (b)  What  if  the  period  is  r? 

3.2.46.  Under  what  conditions  on  f(x)  does  its  Fourier  sine  series  contain  only  even  terms,  i.e., 
its  Fourier  sine  coefficients  bk  =  0  whenever  k  is  odd? 

4b  3.2.47.  Graph  the  partial  sums  s3(x),  s5(x),  s10(x)  of  the  Fourier  series  (3.55).  Do  you  notice  a 
Gibbs  phenomenon?  If  so,  what  is  the  amount  of  overshoot?  If  not,  explain  why. 


3.2.48. Explain why, in the case of the step function $\sigma(x)$, all its Fourier cosine coefficients
vanish, $a_k = 0$, except for $a_0 = 1$.

4b 3.2.49. How many terms do you need to sum in (3.56) to correctly approximate $\pi$ to two
decimal digits? To ten digits?


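3.2.50. Prove that

$$\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k^2} = 1 - \frac{1}{4} + \frac{1}{9} - \frac{1}{16} + \frac{1}{25} - \frac{1}{36} + \frac{1}{49} - \cdots = \frac{\pi^2}{12}.$$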


Complex  Fourier  Series 

An alternative, and often more convenient, approach to Fourier series is to use complex
exponentials instead of sines and cosines. Indeed, Euler’s formula

$$e^{ikx} = \cos kx + i \sin kx, \qquad e^{-ikx} = \cos kx - i \sin kx, \tag{3.59}$$

shows how to write the trigonometric functions

$$\cos kx = \frac{e^{ikx} + e^{-ikx}}{2}, \qquad \sin kx = \frac{e^{ikx} - e^{-ikx}}{2i}, \tag{3.60}$$

in terms of complex exponentials, and so we can easily go back and forth between the two
representations.

Like their trigonometric antecedents, complex exponentials are also endowed with an
underlying orthogonality. But here, since we are dealing with the vector space of complex-valued
functions on the interval $[-\pi, \pi]$, we need to use the rescaled $L^2$ Hermitian inner
product

$$\langle f, g \rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,\overline{g(x)}\,dx, \tag{3.61}$$


in which the second function acquires a complex conjugate, as indicated by the overbar.
This is needed to ensure that the associated $L^2$ Hermitian norm

$$\|f\| = \sqrt{\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x)|^2\,dx} \tag{3.62}$$

is real and positive for all nonzero complex functions: $\|f\| > 0$ when $f \not\equiv 0$. Orthonormality
of the complex exponentials is proved by direct computation:

$$\langle e^{ikx}, e^{ilx} \rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{i(k-l)x}\,dx = \begin{cases} 1, & k = l, \\ 0, & k \ne l, \end{cases} \qquad \|e^{ikx}\|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |e^{ikx}|^2\,dx = 1. \tag{3.63}$$
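The orthonormality relations (3.63) are easy to test numerically. The sketch below is illustrative only: `hermitian_inner` and `exponential` are hypothetical names, and the midpoint rule is an assumed choice of quadrature (it happens to be very accurate for smooth periodic integrands).

```python
import cmath
import math


def hermitian_inner(f, g, n=4096):
    """Rescaled L^2 Hermitian inner product (3.61), midpoint rule with n cells."""
    h = 2.0 * math.pi / n
    total = 0j
    for i in range(n):
        x = -math.pi + (i + 0.5) * h
        # The second argument is conjugated, as required by (3.61).
        total += f(x) * g(x).conjugate()
    return total * h / (2.0 * math.pi)


def exponential(k):
    """Return the function x -> e^{i k x}."""
    return lambda x: cmath.exp(1j * k * x)


if __name__ == "__main__":
    for k in range(-2, 3):
        row = [abs(hermitian_inner(exponential(k), exponential(l))) for l in range(-2, 3)]
        print(k, [round(v, 12) for v in row])
```

The printed table should be, up to rounding, the identity matrix: ones on the diagonal $k = l$ and zeros elsewhere.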


The complex Fourier series for a (piecewise continuous) real or complex function $f$ is
the doubly infinite series

$$f(x) \sim \sum_{k=-\infty}^{\infty} c_k\, e^{ikx} = \cdots + c_{-2}\, e^{-2ix} + c_{-1}\, e^{-ix} + c_0 + c_1\, e^{ix} + c_2\, e^{2ix} + \cdots. \tag{3.64}$$


The orthonormality formulae (3.63) imply that the complex Fourier coefficients are
obtained by taking the inner products

$$c_k = \langle f, e^{ikx} \rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\, e^{-ikx}\,dx. \tag{3.65}$$


Pay  particular  attention  to  the  minus  sign  appearing  in  the  integrated  exponential,  which 
happens  because  the  second  argument  in  the  Hermitian  inner  product  (3.61)  requires  a 
complex  conjugate. 

It must be emphasized that the real (3.34) and complex (3.64) Fourier formulae are just
two different ways of writing the same series! Indeed, if we substitute Euler’s formula (3.59)
into (3.65) and compare the result with the real Fourier formulae (3.35), we find that the
real and complex Fourier coefficients are related by

$$a_k = c_k + c_{-k}, \quad b_k = i\,(c_k - c_{-k}), \qquad c_k = \tfrac{1}{2}(a_k - i\,b_k), \quad c_{-k} = \tfrac{1}{2}(a_k + i\,b_k), \qquad k = 0, 1, 2, \ldots. \tag{3.66}$$
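The conversion rules (3.66) translate directly into code. The following is a small sketch with hypothetical helper names `real_to_complex` and `complex_to_real`; it only encodes the algebra of (3.66) and assumes the coefficients are supplied for a fixed index $k \ge 0$.

```python
def real_to_complex(a_k, b_k):
    """Map real coefficients (a_k, b_k) to the pair (c_k, c_{-k}) via (3.66)."""
    c_plus = 0.5 * (a_k - 1j * b_k)
    c_minus = 0.5 * (a_k + 1j * b_k)
    return c_plus, c_minus


def complex_to_real(c_plus, c_minus):
    """Map (c_k, c_{-k}) back to (a_k, b_k) via (3.66)."""
    a_k = c_plus + c_minus
    b_k = 1j * (c_plus - c_minus)
    return a_k, b_k


if __name__ == "__main__":
    # Coefficients of f(x) = x: a_k = 0 and b_k = 2 (-1)^(k-1) / k.
    for k in range(1, 4):
        b_k = 2.0 * (-1) ** (k - 1) / k
        print(k, real_to_complex(0.0, b_k))
```

For a real function the two outputs of `real_to_complex` are complex conjugates of each other, and for $k = 0$ the rule reduces to $c_0 = \frac{1}{2}a_0$.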
Figure 3.9. $2\pi$-periodic extension of $e^x$.


Remark: We already see one advantage of the complex version. The constant function
$1 = e^{0\,ix}$ no longer plays an anomalous role: the annoying factor of $\frac{1}{2}$ in the real Fourier
series (3.34) has mysteriously disappeared!

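Example 3.17. For the unit step function $\sigma(x)$ considered in Example 3.9, the
complex Fourier coefficients are

$$c_k = \frac{1}{2\pi}\int_{-\pi}^{\pi} \sigma(x)\, e^{-ikx}\,dx = \frac{1}{2\pi}\int_0^{\pi} e^{-ikx}\,dx = \begin{cases} \dfrac{1}{2}, & k = 0, \\[1ex] 0, & 0 \ne k \text{ even}, \\[1ex] \dfrac{1}{i\,k\,\pi}, & k \text{ odd}. \end{cases}$$

Therefore, the step function has the complex Fourier series

$$\sigma(x) \sim \frac{1}{2} - \frac{i}{\pi}\sum_{l=-\infty}^{\infty} \frac{e^{(2l+1)\,ix}}{2l+1}. \tag{3.67}$$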


You  should  convince  yourself  that  this  is  exactly  the  same  series  as  the  real  Fourier  series 
(3.49).  We  are  merely  rewriting  it  using  complex  exponentials  instead  of  real  sines  and 
cosines. 


Example 3.18. Let us find the Fourier series for the exponential function $e^{ax}$. It is
much easier to evaluate the integrals for the complex Fourier coefficients, and so

$$c_k = \langle e^{ax}, e^{ikx} \rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} e^{(a - ik)x}\,dx = \left.\frac{e^{(a - ik)x}}{2\pi\,(a - ik)}\right|_{x=-\pi}^{\pi}$$

$$= \frac{e^{(a - ik)\pi} - e^{-(a - ik)\pi}}{2\pi\,(a - ik)} = (-1)^k\,\frac{e^{a\pi} - e^{-a\pi}}{2\pi\,(a - ik)} = \frac{(-1)^k\,(a + ik)\,\sinh a\pi}{\pi\,(a^2 + k^2)}.$$

Therefore, the desired Fourier series is

$$e^{ax} \sim \frac{\sinh a\pi}{\pi}\sum_{k=-\infty}^{\infty} \frac{(-1)^k\,(a + ik)}{a^2 + k^2}\, e^{ikx}. \tag{3.68}$$
As an exercise, the reader should try writing this as a real Fourier series, either by breaking
up the complex series into its real and imaginary parts, or by direct evaluation of the real
coefficients via their integral formulae (3.35). According to Theorem 3.8 (which is equally
valid for complex Fourier series), the Fourier series converges to the $2\pi$-periodic extension
of the exponential function, as graphed in Figure 3.9. In particular, its value at odd
multiples of $\pi$ is the average of the limiting values there, namely $\cosh a\pi = \frac{1}{2}(e^{a\pi} + e^{-a\pi})$.
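The convergence claims for the complex Fourier series of $e^{ax}$ can be explored with symmetric partial sums. The code below is a sketch, not the text's own method: the names `c_exact` and `partial_sum` are hypothetical, the parameter `a` is assumed to be real and nonzero, and the sum is truncated symmetrically over $-n \le k \le n$.

```python
import cmath
import math


def c_exact(a, k):
    """Complex Fourier coefficient of e^{ax} from Example 3.18 (a real, a != 0)."""
    return (-1) ** k * (a + 1j * k) * math.sinh(a * math.pi) / (math.pi * (a * a + k * k))


def partial_sum(a, x, n):
    """Symmetric partial sum of the complex Fourier series over -n <= k <= n."""
    return sum(c_exact(a, k) * cmath.exp(1j * k * x) for k in range(-n, n + 1))


if __name__ == "__main__":
    a = 1.0
    print("x = 0  :", partial_sum(a, 0.0, 2000), "target", 1.0)
    print("x = pi :", partial_sum(a, math.pi, 2000), "target", math.cosh(a * math.pi))
```

At the interior point $x = 0$ the partial sums approach $e^0 = 1$, while at the jump $x = \pi$ they approach the midpoint value $\cosh a\pi$ rather than either one-sided limit. Convergence is slow because the coefficients decay only like $1/k$.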


Exercises 


3.2.51. Find the complex Fourier series of the following functions: (a) $\sin x$, (b) $\sin^3 x$,
(c) $x$, (d) $|x|$, (e) $|\sin x|$, (f) $\operatorname{sign} x$, (g) the ramp function $\rho(x) = \begin{cases} x, & x > 0, \\ 0, & x < 0. \end{cases}$


3.2.52. Let $-\pi < \xi < \pi$. Determine the complex Fourier series for the shifted step function
$\sigma(x - \xi)$, and graph the function it converges to.


3.2.53. Let $a \in \mathbb{R}$. Find the real form of the Fourier series for the exponential function $e^{ax}$

(a)  by  breaking  up  the  complex  series  (3.68)  into  its  real  and  imaginary  parts; 

(b)  by  direct  evaluation  of  the  real  coefficients  via  their  integral  formulae  (3.35). 

Make  sure  that  your  results  agree! 


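3.2.54. Prove that

$$\coth \pi = \frac{1}{\pi} + \frac{2}{\pi}\left(\frac{1}{1 + 1^2} + \frac{1}{1 + 2^2} + \frac{1}{1 + 3^2} + \cdots\right), \qquad \text{where} \qquad \coth x = \frac{\cosh x}{\sinh x} = \frac{e^{x} + e^{-x}}{e^{x} - e^{-x}}$$

is the hyperbolic cotangent function.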


3.2.55. (a) Find the complex Fourier series for $x\,e^{ix}$.

(b) Use your result to write down the real Fourier series for $x \cos x$ and $x \sin x$.


3.2.56. Prove that if $f(x) = \sum_{k=m}^{n} r_k\, e^{ikx}$ is a complex trigonometric polynomial, with
$-\infty < m \le n < \infty$, then its Fourier coefficients are

$$c_k = \begin{cases} r_k, & m \le k \le n, \\ 0, & \text{otherwise}. \end{cases}$$


3.2.57. True or false: If the complex function $f(x) = g(x) + i\,h(x)$ has Fourier coefficients $c_k$,
then $g(x) = \operatorname{Re} f(x)$ and $h(x) = \operatorname{Im} f(x)$ have, respectively, complex Fourier coefficients
$\operatorname{Re} c_k$ and $\operatorname{Im} c_k$.

3.2.58. Let $f(x)$ be $2\pi$-periodic. Explain how to construct the complex Fourier series for
$f(x - a)$ from that of $f(x)$.

3.2.59. (a) Show that if $c_k$ are the complex Fourier coefficients for $f(x)$, then the Fourier
coefficients of $\tilde f(x) = f(x)\, e^{ix}$ are $\tilde c_k = c_{k-1}$. (b) Let $m$ be an integer. Which function has
complex Fourier coefficients $\tilde c_k = c_{k+m}$? (c) If $a_k, b_k$ are the Fourier coefficients of the real
function $f(x)$, what are the Fourier coefficients of $f(x) \cos x$ and $f(x) \sin x$?


3.2.60. Can you recognize whether a function is real by looking at its complex Fourier
coefficients?


^ 3.2.61. Can you characterize the complex Fourier coefficients of an even function?
an odd function?

<J) 3.2.62. What does it mean for a doubly infinite series $\sum_{k=-\infty}^{\infty} c_k$ to converge? Be precise!
3.3 Differentiation and Integration


Under  appropriate  hypotheses,  if  a  series  of  functions  converges,  then  one  will  be  able 
to  integrate  or  differentiate  it  term  by  term,  and  the  resulting  series  should  converge  to 
the  integral  or  derivative  of  the  original  sum.  For  example,  integration  and  differentiation 
of  power  series  is  always  valid  within  the  range  of  convergence,  and  is  used  extensively 
in the construction of series solutions of differential equations, series for integrals of
nonelementary functions, and so on. (See Section 11.3 for further details.) The convergence
of  Fourier  series  is  considerably  more  delicate,  and  so  one  must  exercise  due  care  when 
differentiating  or  integrating.  Nevertheless,  in  favorable  situations,  both  operations  lead  to 
valid  results,  and  are  quite  useful  for  constructing  Fourier  series  of  more  intricate  functions. 


Integration  of  Fourier  Series 


Integration is a smoothing operation: the integrated function is always nicer than the
original. Therefore, we should anticipate being able to integrate Fourier series without
difficulty. There is, however, one complication: the integral of a periodic function is not
necessarily periodic. The simplest example is the constant function 1, which is certainly
periodic, but its integral, namely $x$, is not. On the other hand, integrals of all the other
periodic sine and cosine functions appearing in the Fourier series are periodic. Thus, only
the constant term

$$\frac{a_0}{2} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx \tag{3.69}$$

might cause us difficulty when we try to integrate a Fourier series (3.34). Note that (3.69)
is the mean, or average, of the function $f(x)$ over the interval $[-\pi, \pi]$, and so a function
has no constant term in its Fourier series, i.e., $a_0 = 0$, if and only if it has mean zero. It
is easily shown, cf. Exercise 3.2.10, that the mean-zero functions are precisely those that
remain periodic upon integration. In particular, Lemma 3.13 implies that all odd functions
automatically have mean zero, and hence have periodic integrals.

Lemma 3.19. If $f(x)$ is $2\pi$-periodic, then its integral $g(x) = \int_0^x f(y)\,dy$ is $2\pi$-periodic
if and only if $\int_{-\pi}^{\pi} f(x)\,dx = 0$, so that $f$ has mean zero on the interval $[-\pi, \pi]$.


In view of the elementary integration formulae

$$\int \cos kx\,dx = \frac{\sin kx}{k}, \qquad \int \sin kx\,dx = -\,\frac{\cos kx}{k}, \tag{3.70}$$

termwise integration of a Fourier series without constant term is straightforward.


Theorem 3.20. If $f$ is piecewise continuous and has mean zero on the interval
$[-\pi, \pi]$, then its Fourier series

$$f(x) \sim \sum_{k=1}^{\infty} \left[\, a_k \cos kx + b_k \sin kx \,\right]$$

can be integrated term by term, to produce the Fourier series

$$g(x) = \int_0^x f(y)\,dy \sim m + \sum_{k=1}^{\infty} \left[ -\frac{b_k}{k} \cos kx + \frac{a_k}{k} \sin kx \right]. \tag{3.71}$$

The constant term

$$m = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(x)\,dx \tag{3.72}$$

is the mean of the integrated function.

Example 3.21. The function $f(x) = x$ is odd, and so has mean zero: $\int_{-\pi}^{\pi} x\,dx = 0$.
Let us integrate its Fourier series

$$x \sim 2\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k}\,\sin kx, \tag{3.73}$$

which we found in Example 3.3. The result is the Fourier series

$$\frac{1}{2}x^2 \sim \frac{\pi^2}{6} - 2\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k^2}\,\cos kx = \frac{\pi^2}{6} - 2\left(\cos x - \frac{\cos 2x}{4} + \frac{\cos 3x}{9} - \frac{\cos 4x}{16} + \cdots\right), \tag{3.74}$$

whose constant term is the mean of the left-hand side:

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{x^2}{2}\,dx = \frac{\pi^2}{6}.$$
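The integrated series (3.74) can be compared against $\frac{1}{2}x^2$ directly. This is a minimal numerical sketch; the function name `integrated_series` and the truncation level are illustrative assumptions.

```python
import math


def integrated_series(x, n):
    """Partial sum of (3.74): pi^2/6 - 2 * sum_{k=1}^n (-1)^(k-1) cos(kx) / k^2."""
    s = math.pi ** 2 / 6.0
    for k in range(1, n + 1):
        s -= 2.0 * (-1) ** (k - 1) * math.cos(k * x) / k ** 2
    return s


if __name__ == "__main__":
    for x in (-3.0, -1.0, 0.0, 0.5, 2.0, math.pi):
        print(x, integrated_series(x, 2000), 0.5 * x * x)
```

Since the coefficients decay like $1/k^2$, the truncation error is bounded uniformly on $[-\pi, \pi]$ by roughly $2/n$, in contrast to the much slower convergence of the original series (3.73).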


Let us revisit the derivation of the integrated Fourier series from a slightly different
standpoint. If we were to integrate each trigonometric summand in a Fourier series (3.34)
from 0 to $x$, we would obtain

$$\int_0^x \cos ky\,dy = \frac{\sin kx}{k}, \qquad \text{whereas} \qquad \int_0^x \sin ky\,dy = \frac{1}{k} - \frac{\cos kx}{k}.$$

The extra $1/k$ terms coming from the definite sine integrals did not appear explicitly in
our previous expression for the integrated Fourier series, (3.71), and so must be hidden in
the constant term $m$. We deduce that the mean value of the integrated function can be
computed using the Fourier sine coefficients of $f$ via the formula

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} g(x)\,dx = m = \sum_{k=1}^{\infty} \frac{b_k}{k}. \tag{3.75}$$

For example, integrating both sides of the Fourier series (3.73) for $f(x) = x$ from 0 to $x$
produces

$$\frac{x^2}{2} \sim 2\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k^2}\,(1 - \cos kx).$$

The constant terms sum to yield the mean value of the integrated function:

$$2\left(1 - \frac{1}{4} + \frac{1}{9} - \frac{1}{16} + \cdots\right) = 2\sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k^2} = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{x^2}{2}\,dx = \frac{\pi^2}{6}, \tag{3.76}$$

which reproduces a formula established in Exercise 3.2.50.
More generally, if $f(x)$ does not have mean zero, its Fourier series contains a nonzero
constant term,

$$f(x) \sim \frac{a_0}{2} + \sum_{k=1}^{\infty} \left[\, a_k \cos kx + b_k \sin kx \,\right].$$

In this case, the result of integration will be

$$g(x) = \int_0^x f(y)\,dy \sim \frac{a_0}{2}\,x + m + \sum_{k=1}^{\infty} \left[ -\frac{b_k}{k} \cos kx + \frac{a_k}{k} \sin kx \right], \tag{3.77}$$

where $m$ is given in (3.75). The right-hand side is not, strictly speaking, a Fourier series.
There are two ways to interpret this formula within the Fourier framework. We can write
(3.77) as the Fourier series for the difference

$$g(x) - \frac{a_0}{2}\,x \sim m + \sum_{k=1}^{\infty} \left[ -\frac{b_k}{k} \cos kx + \frac{a_k}{k} \sin kx \right], \tag{3.78}$$

which, by Exercise 3.2.10(d), is a $2\pi$-periodic function. Alternatively, we can replace $x$
by its Fourier series (3.37), and the result will be the Fourier series for the $2\pi$-periodic
extension of the integral $g(x) = \int_0^x f(y)\,dy$.


Differentiation  of  Fourier  Series 

Differentiation has the opposite effect: it makes a function worse. Therefore, to justify
taking the derivative of a Fourier series, we need to know that the derived function remains
reasonably nice. Since we need the derivative $f'(x)$ to be piecewise $C^1$ for the Convergence
Theorem 3.8 to be applicable, we require that $f(x)$ itself be continuous and piecewise $C^2$.

Theorem 3.22. If $f(x)$ has a piecewise $C^2$ and continuous $2\pi$-periodic extension,
then its Fourier series can be differentiated term by term, to produce the Fourier series for
its derivative

$$f'(x) \sim \sum_{k=1}^{\infty} \left[\, k\,b_k \cos kx - k\,a_k \sin kx \,\right] = \sum_{k=-\infty}^{\infty} i\,k\,c_k\, e^{ikx}. \tag{3.79}$$


Example 3.23. The derivative (6.31) of the absolute value function $f(x) = |x|$ is
the sign function:

$$\frac{d}{dx}\,|x| = \operatorname{sign} x = \begin{cases} +1, & x > 0, \\ -1, & x < 0. \end{cases} \tag{3.80}$$

Therefore, if we differentiate its Fourier series (3.55), we obtain the Fourier series

$$\operatorname{sign} x \sim \frac{4}{\pi}\left(\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \frac{\sin 7x}{7} + \cdots\right). \tag{3.81}$$

Note that $\operatorname{sign} x = \sigma(x) - \sigma(-x)$ is the difference of two step functions. Indeed, subtracting
the step function Fourier series (3.49) at $x$ from the same series at $-x$ reproduces (3.81).
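Termwise differentiation in Example 3.23 can be mimicked numerically by applying the rule of Theorem 3.22 to the cosine coefficients of $|x|$ read off from (3.55). The sketch below uses hypothetical names `abs_cosine_coeff` and `derivative_partial_sum`; since $|x|$ is even, all $b_k$ vanish and each term $a_k \cos kx$ differentiates to $-k\,a_k \sin kx$.

```python
import math


def abs_cosine_coeff(k):
    """Cosine coefficient a_k (k >= 1) of |x| read off from (3.55)."""
    return -4.0 / (math.pi * k * k) if k % 2 == 1 else 0.0


def derivative_partial_sum(x, n):
    """Termwise derivative: sum_{k=1}^n (-k a_k) sin(kx), cf. (3.79) with b_k = 0."""
    return sum(-k * abs_cosine_coeff(k) * math.sin(k * x) for k in range(1, n + 1))


if __name__ == "__main__":
    for x in (-1.0, 0.0, 1.0, math.pi / 2):
        print(x, derivative_partial_sum(x, 2001))
```

Away from the jumps the partial sums approach $\pm 1$, and at $x = 0$ every term vanishes, which is the midpoint value of the sign function there.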
Exercises


3.3.1. Starting with the Fourier series (3.49) for the step function $\sigma(x)$, use integration to:

(a) Find the Fourier series for the ramp function $\rho(x) = \begin{cases} x, & x > 0, \\ 0, & x < 0. \end{cases}$

(b) Then, find the Fourier series for the second-order ramp function $\rho_2(x) = \begin{cases} \frac{1}{2}x^2, & x > 0, \\ 0, & x < 0. \end{cases}$


3.3.2. Find the Fourier series for the function $f(x) = x^3$. If you differentiate your series, do you
recover the Fourier series for $f'(x) = 3x^2$? If not, explain why not.

3.3.3. Answer Exercise 3.3.2 when $f(x) = x^4$.

3.3.4. Use Theorem 3.20 to construct the Fourier series for (a) $x^3$, (b) $x^4$.

3.3.5.  Write  down  the  identities  obtained  by  substituting  x  =  0,  \i r,  and  ^tt  in  the  Fourier 
series  (3.74). 

0 3.3.6. Suppose $f(x)$ is a $2\pi$-periodic function with complex Fourier coefficients $c_k$, and $g(x)$
is a $2\pi$-periodic function with complex Fourier coefficients $d_k$. (a) Find the Fourier
coefficients of their periodic convolution $\int_{-\pi}^{\pi} f(x - y)\,g(y)\,dy$.

(b) Find the complex Fourier series for the periodic convolution of $\cos 3x$ and $\sin 2x$.

(c) Answer part (b) for the functions $x$ and $\sin 2x$.

0 3.3.7. Suppose $f$ is piecewise continuous on $[-\pi, \pi]$. Prove that the mean of the integrated
function $g(x) = \int_0^x f(y)\,dy$ equals

$$\frac{1}{2}\int_{-\pi}^{\pi} \left(\operatorname{sign} x - \frac{x}{\pi}\right) f(x)\,dx.$$


3.3.8. Suppose the $2\pi$-periodic extension of $f(x)$ is continuous and piecewise $C^1$. Prove
directly from the formulas (3.35) that the Fourier coefficients of its derivative $\tilde f(x) = f'(x)$
are, respectively, $\tilde a_k = k\,b_k$ and $\tilde b_k = -k\,a_k$, where $a_k, b_k$ are the Fourier coefficients of $f(x)$.


3.3.9.  Explain  how  to  integrate  a  complex  Fourier  series  (3.64).  Under  what  conditions  is  your 
formula  valid? 

T 3.3.10. The initial value problem $\dfrac{d^2u}{dt^2} + u = f(t)$, $u(0) = 0$, $\dfrac{du}{dt}(0) = 0$, describes the forced
motion of an initially motionless unit mass attached to a unit spring.

(a) Solve the initial value problem when $f(t) = \cos kt$ and $f(t) = \sin kt$ for $k = 0, 1, \ldots$.

(b) Assuming that the forcing function $f(t)$ is $2\pi$-periodic, write out its Fourier series, and
then use your result from part (a) to write out a series for the solution $u(t)$.

(c) Under what conditions is the result a convergent Fourier series, and hence the solution
$u(t)$ remains $2\pi$-periodic?

(d) Explain why $f(t)$ induces a resonance of the mass-spring system if and only if its Fourier
coefficients of order 1 are not both zero: $a_1^2 + b_1^2 \ne 0$.


3.4  Change  of  Scale 


So far, we have dealt only with Fourier series on the standard interval of length $2\pi$. We
chose $[-\pi, \pi]$ for convenience, but all of the results and formulas are easily adapted to any
other interval of the same length, e.g., $[0, 2\pi]$. However, since physical objects like bars
and strings do not all come in this particular length, we need to understand how to adapt
the formulas to more general intervals.

Any symmetric interval $[-\ell, \ell]$ of length $2\ell$ can be rescaled (stretched) to the standard
interval $[-\pi, \pi]$ through the linear change of variables

$$x = \frac{\ell}{\pi}\,y, \qquad \text{so that} \quad -\pi \le y \le \pi \quad \text{whenever} \quad -\ell \le x \le \ell. \tag{3.82}$$


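Given a function $f(x)$ defined on $[-\ell, \ell]$, the rescaled function $F(y) = f\!\left(\dfrac{\ell}{\pi}\,y\right)$ lives on
$[-\pi, \pi]$. Let

$$F(y) \sim \frac{a_0}{2} + \sum_{k=1}^{\infty} \left[\, a_k \cos ky + b_k \sin ky \,\right]$$

be the standard Fourier series for $F(y)$, so that

$$a_k = \frac{1}{\pi}\int_{-\pi}^{\pi} F(y) \cos ky\,dy, \qquad b_k = \frac{1}{\pi}\int_{-\pi}^{\pi} F(y) \sin ky\,dy. \tag{3.83}$$

Then, reverting to the unscaled variable $x$, we deduce that

$$f(x) \sim \frac{a_0}{2} + \sum_{k=1}^{\infty} \left[\, a_k \cos \frac{k\pi x}{\ell} + b_k \sin \frac{k\pi x}{\ell} \,\right] \tag{3.84}$$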


is the Fourier series of $f(x)$ on the interval $[-\ell, \ell]$. The Fourier coefficients $a_k, b_k$ can,
in fact, be computed directly without appealing to the rescaling. Indeed, replacing the
integration variable in (3.83) by $y = \pi x/\ell$, and noting that $dy = (\pi/\ell)\,dx$, we deduce the
rescaled formulae

$$a_k = \frac{1}{\ell}\int_{-\ell}^{\ell} f(x) \cos \frac{k\pi x}{\ell}\,dx, \qquad b_k = \frac{1}{\ell}\int_{-\ell}^{\ell} f(x) \sin \frac{k\pi x}{\ell}\,dx, \tag{3.85}$$

for the Fourier coefficients of $f(x)$ on the interval $[-\ell, \ell]$.

All of the convergence results, integration and differentiation formulae, etc., that are
valid for the interval $[-\pi, \pi]$ carry over, essentially unchanged, to Fourier series on
nonstandard intervals. In particular, adapting our basic convergence Theorem 3.8, we conclude
that if $f(x)$ is piecewise $C^1$, then its rescaled Fourier series (3.84) converges to its $2\ell$-periodic
extension $\tilde f(x)$, subject to the proviso that $\tilde f(x)$ takes on the midpoint values at all
jump discontinuities.

Example 3.24. Let us compute the Fourier series for the function $f(x) = x$ on the
interval $-1 \le x \le 1$. Since $f$ is odd, only the sine coefficients will be nonzero. We have

$$b_k = \int_{-1}^{1} x \sin k\pi x\,dx = \left[ -\frac{x \cos k\pi x}{k\pi} + \frac{\sin k\pi x}{(k\pi)^2} \right]_{x=-1}^{1} = \frac{2\,(-1)^{k+1}}{k\pi}.$$

The resulting Fourier series is

$$x \sim \frac{2}{\pi}\left(\sin \pi x - \frac{\sin 2\pi x}{2} + \frac{\sin 3\pi x}{3} - \cdots\right).$$

The series converges to the 2-periodic extension of the function $x$, namely

$$\tilde f(x) = \begin{cases} x - 2m, & 2m - 1 < x < 2m + 1, \\ 0, & x = m, \end{cases} \qquad \text{where } m \in \mathbb{Z} \text{ is an arbitrary integer},$$

which is plotted in Figure 3.10.
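The rescaled coefficient formulae (3.85) can be evaluated numerically on any symmetric interval. The following sketch uses hypothetical helper names `simpson` and `fourier_coefficients`, with the half-length passed as the parameter `ell`; it assumes a composite Simpson rule is accurate enough for the smooth integrand of Example 3.24.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def fourier_coefficients(f, ell, k):
    """Return (a_k, b_k) of f on [-ell, ell] using the rescaled formulae (3.85)."""
    w = k * math.pi / ell
    a_k = simpson(lambda x: f(x) * math.cos(w * x), -ell, ell) / ell
    b_k = simpson(lambda x: f(x) * math.sin(w * x), -ell, ell) / ell
    return a_k, b_k


if __name__ == "__main__":
    for k in range(1, 5):
        a_k, b_k = fourier_coefficients(lambda x: x, 1.0, k)
        print(k, a_k, b_k, 2.0 * (-1) ** (k + 1) / (k * math.pi))
```

For $f(x) = x$ with $\ell = 1$ the computed $b_k$ should match $2(-1)^{k+1}/(k\pi)$ from Example 3.24, and the cosine coefficients should vanish because $f$ is odd.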
Figure 3.10. 2-periodic extension of x.


We can similarly reformulate complex Fourier series on the nonstandard interval
$[-\ell, \ell]$. Using (3.82) to rescale the variables in (3.64), we obtain

$$f(x) \sim \sum_{k=-\infty}^{\infty} c_k\, e^{i k \pi x/\ell}, \qquad \text{where} \qquad c_k = \frac{1}{2\ell}\int_{-\ell}^{\ell} f(x)\, e^{-i k \pi x/\ell}\,dx. \tag{3.86}$$


Again,  this  is  merely  an  alternative  way  of  writing  the  real  Fourier  series  (3.84). 

When dealing with a more general interval $[a, b]$, there are two possible options. The
first is to take a function $f(x)$ defined for $a < x < b$ and periodically extend it to a function
$\tilde f(x)$ that agrees with $f(x)$ on $[a, b]$ and has period $b - a$. One can then compute the Fourier
series (3.84) for its periodic extension $\tilde f(x)$ on the symmetric interval $[-\ell, \ell]$ of width
$2\ell = b - a$; the resulting Fourier series will (under the appropriate hypotheses) converge
to $\tilde f(x)$ and hence agree with $f(x)$ on the original interval. An alternative approach is to
translate the interval by an amount $\frac{1}{2}(a + b)$ so as to make it symmetric around the origin;
this is accomplished by the change of variables $\hat x = x - \frac{1}{2}(a + b)$, followed by an additional
rescaling to convert the interval into $[-\pi, \pi]$. The two methods are essentially equivalent,
and details are left to the reader.


Exercises 


3.4.1. Let $f(x) = x^2$ for $0 \le x \le 1$. Find its (a) Fourier sine series; (b) Fourier cosine series.

3.4.2. Find the Fourier sine series and the Fourier cosine series of the following functions
defined on the interval $[0, 1]$; then graph the function to which the series converges:

(a) 1, (b) $\sin \pi x$, (c) $\sin^3 \pi x$, (d) $x(1 - x)$.

3.4.3. Find the Fourier series for the following functions on the indicated intervals, and graph
the function that the Fourier series converges to.

(a) $|x|$, $-3 \le x \le 3$, (b) $x^2 - 4$, $-2 \le x \le 2$, (c) $e^x$, $-10 \le x \le 10$,

(d) $\sin x$, $-1 \le x \le 1$, (e) $\sigma(x)$, $-2 \le x \le 2$.

3.4.4.  For  each  of  the  functions  in  Exercise  3.4.3,  write  out  the  differentiated  Fourier  series,  and 
determine  whether  it  converges  to  the  derivative  of  the  original  function. 

3.4.5.  Find  the  Fourier  series  for  the  integral  of  each  of  the  functions  in  Exercise  3.4.3. 

0 3.4.6. Write down formulas for the Fourier series of both even and odd functions on $[-\ell, \ell]$.

3.4.7. Let $f(x)$ be a continuous function on $[0, \ell]$.

(a) Under what conditions is its odd $2\ell$-periodic extension also continuous?

(b) Under what conditions is its odd extension also continuously differentiable?

3.4.8. (a) Write down the formulae for the Fourier series for a function $f(x)$ defined on the
interval $0 \le x \le 2\pi$. (b) Use your formula in the case $f(x) = x$. Is the result the same as
(3.37)? Explain, and, if different, discuss the connection between the two Fourier series.

3.4.9. Find the Fourier series for the function $f(x) = x$ on the interval $1 \le x \le 2$ using the two
different methods described in the last paragraph of this subsection. Are your Fourier series
the same? Explain. Graph the functions that the Fourier series converge to.

3.4.10. Answer Exercise 3.4.9 when $f(x) = \sin x$ on the interval $\pi \le x \le 2\pi$.


3.5  Convergence  of  Fourier  Series 


The  goal  of  this  final  section  is  to  establish  some  of  the  most  basic  convergence  results  for 
Fourier  series.  This  is  not  a  purely  theoretical  enterprise,  since  convergence  considerations 
impinge directly upon applications. One particularly important consequence is the
connection between the degree of smoothness of a function and the decay rate of its high-order
Fourier  coefficients  —  a  result  that  is  exploited  in  signal  and  image  denoising  and  in  the 
analytic  properties  of  solutions  to  partial  differential  equations. 

This  section  is  written  at  a  slightly  more  theoretically  sophisticated  level  than  what  you 
have  read  so  far.  However,  an  appreciation  of  the  full  scope,  and  limitations,  of  Fourier 
analysis  requires  some  familiarity  with  the  underlying  theory.  Moreover,  the  required 
techniques  and  proofs  serve  as  an  excellent  introduction  to  some  of  the  most  important 
tools  of  modern  mathematical  analysis,  and  the  effort  you  expend  to  assimilate  this  material 
will  be  more  than  amply  rewarded  in  both  this  book  and  your  subsequent  mathematical 
studies,  be  they  applied  or  pure. 

Unlike power series, which converge to analytic functions on the interval of
convergence, and diverge elsewhere (the only tricky point being whether or not the series converges
at the endpoints), the convergence of a Fourier series is a much subtler matter, and still
not  completely  understood.  A  large  part  of  the  difficulty  stems  from  the  intricacies  of 
convergence  in  infinite-dimensional  function  spaces.  Let  us  therefore  begin  with  a  brief 
outline  of  the  key  issues. 


We assume that you are familiar with the usual calculus definition of the limit of a
sequence of real numbers: $\lim_{n \to \infty} a_n = a^\star$. In any finite-dimensional vector space, e.g.,
$\mathbb{R}^m$, there is essentially only one way for a sequence of vectors $v^{(0)}, v^{(1)}, v^{(2)}, \ldots \in \mathbb{R}^m$ to
converge, as guaranteed by any one of the following equivalent criteria:


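•  The vectors converge: $v^{(n)} \to v^\star \in \mathbb{R}^m$ as $n \to \infty$.

•  The individual components of $v^{(n)} = (v_1^{(n)}, \ldots, v_m^{(n)})$ converge, so $\lim_{n \to \infty} v_j^{(n)} = v_j^\star$ for all $j = 1, \ldots, m$.

•  The norm of the difference goes to zero: $\| v^{(n)} - v^\star \| \to 0$ as $n \to \infty$.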


The  last  requirement,  known  as  convergence  in  norm ,  does  not,  in  fact,  depend  on  which 
norm  is  chosen.  Indeed,  on  a  finite-dimensional  vector  space,  all  norms  are  essentially 
equivalent,  and  if  one  norm  goes  to  zero,  so  does  any  other  norm,  [89;  Theorem  3.17]. 
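
The equivalence of norms can be checked numerically on a small example. The sketch below uses a sample planar sequence $(1 - 1/n,\ e^{-n})$ with limit $(1, 0)$, chosen here purely for illustration, and prints the 1-norm, Euclidean norm, and max-norm of the difference; the helper names `v` and `norms_of_difference` are hypothetical.

```python
import math

V_STAR = (1.0, 0.0)  # claimed limit of the sample sequence


def v(n):
    """n-th vector of the sample sequence (1 - 1/n, e^{-n}); an illustrative choice."""
    return (1.0 - 1.0 / n, math.exp(-n))


def norms_of_difference(n):
    """Return the 1-norm, Euclidean norm, and max-norm of v(n) - V_STAR."""
    d = [abs(a - b) for a, b in zip(v(n), V_STAR)]
    return sum(d), math.sqrt(sum(t * t for t in d)), max(d)


for n in (1, 10, 100, 1000):
    one, two, sup = norms_of_difference(n)
    print(f"n={n:5d}  1-norm={one:.3e}  2-norm={two:.3e}  max-norm={sup:.3e}")
```

In two dimensions the three norms differ by at most a factor of 2, so all three columns shrink together, which is the finite-dimensional behavior described above.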
On the other hand, the analogous convergence criteria are certainly not the same in
infinite-dimensional spaces. There is, in fact, a bewildering variety of convergence mechanisms
in function space, including pointwise convergence, uniform convergence, convergence
in norm, weak convergence, and so on. Each plays a significant role in advanced mathematical
analysis, and hence all are deserving of study. Here, though, we shall cover just
the most basic aspects of convergence of the Fourier series and their applications to partial
differential equations, leaving the complete development to a more specialized text, e.g.,
[37, 128].


Pointwise  and  Uniform  Convergence 


The most familiar convergence mechanism for a sequence of functions $v_n(x)$ is pointwise
convergence. This requires that the functions’ values at each individual point converge in
the usual sense:

$$\lim_{n \to \infty} v_n(x) = v_\star(x) \quad \text{for all } x \in I, \tag{3.87}$$

where $I \subset \mathbb{R}$ denotes an interval contained in their common domain. Even more explicitly,
pointwise convergence requires that, for every $\varepsilon > 0$ and every $x \in I$, there exist an integer
$N$, depending on $\varepsilon$ and $x$, such that

$$| v_n(x) - v_\star(x) | < \varepsilon \quad \text{for all } n \ge N. \tag{3.88}$$


Pointwise  convergence  can  be  viewed  as  the  function  space  version  of  the  convergence  of  the 
components  of  a  vector.  We  have  already  stated  the  Fundamental  Theorem  3.8  regarding 
pointwise  convergence  of  Fourier  series;  the  proof  will  be  deferred  until  the  end  of  this 
section. 

On  the  other  hand,  establishing  uniform  convergence  of  a  Fourier  series  is  not  so 
difficult,  and  so  we  will  begin  there.  The  basic  definition  of  uniform  convergence  looks  very 
similar  to  that  of  pointwise  convergence,  with  a  subtle,  but  important,  difference. 


Definition 3.25. A sequence of functions $v_n(x)$ is said to converge uniformly to a
function $v_\star(x)$ on a subset $I \subset \mathbb{R}$ if, for every $\varepsilon > 0$, there exists an integer $N$, depending
solely on $\varepsilon$, such that

$$| v_n(x) - v_\star(x) | < \varepsilon \quad \text{for all } x \in I \text{ and all } n \ge N. \tag{3.89}$$


Clearly, a uniformly convergent sequence of functions converges pointwise, but the
converse does not hold. The key difference, and the reason for the term “uniform
convergence”, is that the integer $N$ depends only on $\varepsilon$ and not on the point $x \in I$.
According to (3.89), the sequence converges uniformly if and only if for every small $\varepsilon$, the
graphs of the functions eventually lie inside a band of width $2\varepsilon$ centered on the graph of
the limiting function, as in the first plot in Figure 3.11. The Gibbs phenomenon shown
in Figure 3.7 is a prototypical example of nonuniform convergence: For a given $\varepsilon > 0$, the
closer $x$ is to the discontinuity, the larger $n$ must be chosen so that the inequality in (3.89)
holds. Hence, there is no uniform choice of $N$ that makes the inequality (3.89) valid for
all $x$ and all $n \ge N$.

A  key  feature  of  uniform  convergence  is  that  it  preserves  continuity. 

Theorem 3.26. If each $v_n(x)$ is continuous and $v_n(x) \to v_\star(x)$ converges uniformly,
then $v_\star(x)$ is also a continuous function.

Figure 3.11. Uniform and nonuniform convergence of functions.


The proof is by contradiction. Intuitively, if $v_\star(x)$ were to have a discontinuity, then, as
sketched in the second plot in Figure 3.11, a sufficiently small band around its graph would
not connect together, and this prevents the connected graph of any continuous function,
such as $v_n(x)$, from remaining entirely within the band. A detailed discussion of these
issues,  including  the  proofs  of  the  basic  theorems,  can  be  found  in  any  introductory  real 
analysis  text,  [8,  96,  97]. 


Warning: A sequence of continuous functions can converge nonuniformly to a continuous
function. For example, the sequence

$$v_n(x) = \frac{2 n x}{1 + n^2 x^2}$$

converges pointwise to $v_\star(x) \equiv 0$ (why?) but not uniformly, since

$$\max | v_n(x) | = v_n\!\left(\tfrac{1}{n}\right) = 1,$$

which implies that (3.89) cannot hold when $\varepsilon < 1$.
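
The contrast between the two modes of convergence in this example can be seen numerically. The following is a minimal sketch, with a hypothetical helper `sup_on_grid` that approximates the maximum on $[0, 1]$ by grid sampling: the value at a fixed point tends to zero, while the maximum stays at 1 for every $n$.

```python
def v(n, x):
    """v_n(x) = 2 n x / (1 + n^2 x^2)."""
    return 2.0 * n * x / (1.0 + (n * x) ** 2)


def sup_on_grid(n, samples=100_001):
    """Approximate max |v_n(x)| over [0, 1] by sampling a uniform grid."""
    return max(abs(v(n, i / (samples - 1))) for i in range(samples))


for n in (1, 10, 100, 1000):
    print(f"n={n:5d}  v_n(0.5)={v(n, 0.5):.5f}  grid max={sup_on_grid(n):.5f}")
```

The grid is chosen fine enough that $x = 1/n$ is a sample point for the values of $n$ shown; for much larger $n$ a finer grid would be needed to resolve the peak.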


The convergence (pointwise, uniform, etc.) of an infinite series $\sum_{k=1}^{\infty} u_k(x)$ is, by
definition, dictated by the convergence of its sequence of partial sums

$$v_n(x) = \sum_{k=1}^{n} u_k(x). \tag{3.90}$$


The most useful test for uniform convergence of series of functions is known as the Weierstrass
M-test, in honor of the nineteenth century German mathematician Karl Weierstrass,
known as the “father of modern analysis”.

Theorem 3.27. Let $I \subset \mathbb{R}$. Suppose that, for each $k = 1, 2, 3, \ldots$, the function
$u_k(x)$ is bounded:

$$| u_k(x) | \le m_k \quad \text{for all } x \in I, \tag{3.91}$$

where $m_k \ge 0$ is a nonnegative constant. If the constant series

$$\sum_{k=1}^{\infty} m_k < \infty \tag{3.92}$$

converges, then the function series

$$\sum_{k=1}^{\infty} u_k(x) = f(x) \tag{3.93}$$

converges uniformly and absolutely† to a function $f(x)$ for all $x \in I$. In particular, if the
summands $u_k(x)$ are continuous, so is the sum $f(x)$.

Warning :  Failure  of  the  M-test  strongly  indicates,  but  does  not  necessarily  preclude, 
that  a  pointwise  convergent  series  does  not  converge  uniformly. 
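
To see the M-test at work, consider the sample series $\sum_k \cos(kx)/k^2$, chosen here only for illustration, with $m_k = 1/k^2$. The sketch below compares the largest gap between two partial sums on a grid with the corresponding tail of the constant series; the function names are hypothetical.

```python
import math


def block(x, n, big_n):
    """S_big_n(x) - S_n(x) = sum_{k=n+1}^{big_n} cos(k x) / k^2."""
    return sum(math.cos(k * x) / k**2 for k in range(n + 1, big_n + 1))


def tail_bound(n, big_n):
    """sum_{k=n+1}^{big_n} m_k with the M-test constants m_k = 1/k^2."""
    return sum(1.0 / k**2 for k in range(n + 1, big_n + 1))


def sup_gap(n, big_n, samples=201):
    """Largest |S_big_n(x) - S_n(x)| over a uniform grid on [-pi, pi]."""
    xs = [-math.pi + 2.0 * math.pi * i / (samples - 1) for i in range(samples)]
    return max(abs(block(x, n, big_n)) for x in xs)


for n in (5, 20, 80):
    print(f"n={n:3d}  sup gap={sup_gap(n, 400):.5f}  "
          f"tail bound={tail_bound(n, 400):.5f}  1/n={1.0 / n:.5f}")
```

The gap never exceeds the tail of $\sum m_k$, and that tail is below $1/n$ independently of $x$, which is exactly the uniformity the theorem asserts.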

With some care, we can manipulate uniformly convergent series just like finite sums.
Thus, if (3.93) is a uniformly convergent series, so is its term-wise product

$$\sum_{k=1}^{\infty} g(x)\, u_k(x) = g(x)\, f(x) \tag{3.94}$$

with any bounded function: $| g(x) | \le C$ for all $x \in I$. We can integrate a uniformly
convergent series term by term,‡ and the resulting integrated series

$$\int_a^x \Bigl( \sum_{k=1}^{\infty} u_k(y) \Bigr) dy = \sum_{k=1}^{\infty} \int_a^x u_k(y)\, dy = \int_a^x f(y)\, dy \tag{3.95}$$

is uniformly convergent. Differentiation is also allowed, but only when the differentiated
series converges uniformly.


Proposition 3.28. Suppose the series $\sum_{k=1}^{\infty} u_k(x) = f(x)$ converges pointwise. If the
differentiated series $\sum_{k=1}^{\infty} u_k'(x) = g(x)$ is uniformly convergent, then the original series is
also uniformly convergent, and, moreover, $f'(x) = g(x)$.


We are particularly interested in the convergence of a Fourier series, which, to facilitate
the exposition, we take in its complex form

$$f(x) \sim \sum_{k=-\infty}^{\infty} c_k\, e^{i k x}. \tag{3.96}$$

Since $x$ is real, $| e^{i k x} | \le 1$, and hence the individual summands are bounded by

$$| c_k\, e^{i k x} | \le | c_k | \quad \text{for all } x.$$


Applying  the  Weierstrass  M-test,  we  immediately  deduce  the  basic  result  on  uniform 
convergence  of  Fourier  series. 


† Recall that a series $\sum_{n=1}^{\infty} a_n = a^\star$ is said to converge absolutely if $\sum_{n=1}^{\infty} | a_n |$ converges.

‡ Assuming that the individual functions are all integrable.

Theorem 3.29. If the Fourier coefficients $c_k$ of a function $f(x)$ satisfy

$$\sum_{k=-\infty}^{\infty} | c_k | < \infty, \tag{3.97}$$

then the Fourier series (3.96) converges uniformly to a continuous function $\tilde f(x)$ that has
the same Fourier coefficients: $c_k = \langle f, e^{i k x} \rangle = \langle \tilde f, e^{i k x} \rangle$.


Proof: Uniform convergence and continuity of the limiting function follow from Theorem 3.27.
To show that the $c_k$ actually are the Fourier coefficients of the sum, we multiply
the Fourier series by $e^{-i k x}$ and integrate term by term from $-\pi$ to $\pi$. As in (3.94) and (3.95),
both operations are valid thanks to the uniform convergence of the series. Q.E.D.


Remark:  As  with  the  Weierstrass  test,  failure  of  condition  (3.97)  strongly  indicates 
that  the  Fourier  series  does  not  converge  uniformly,  but  does  not  completely  rule  it  out; 
nor  does  it  say  anything  about  nonuniform  convergence  or  lack  thereof. 


The one thing that Theorem 3.29 does not guarantee is that the original function $f(x)$
used to compute the Fourier coefficients $c_k$ is the same as the function $\tilde f(x)$ obtained by
summing the resulting Fourier series! Indeed, this may very well not be the case. As we
know, the function that the series converges to is necessarily $2\pi$-periodic. Thus, at the very
least, $\tilde f(x)$ will be the $2\pi$-periodic extension of $f(x)$. But even this may not suffice. Indeed,
two functions $f(x)$ and $\hat f(x)$ that have the same values except at a finite set of points
$x_1, \ldots, x_m$ have the same Fourier coefficients. (Why?) For example, the discontinuous
function

$$f(x) = \begin{cases} 1, & x = 0, \\ 0, & \text{otherwise}, \end{cases}$$

has all zero Fourier coefficients, and hence its Fourier
series converges to the continuous zero function. More generally, two functions that agree
everywhere outside a set of “measure zero” will have identical Fourier coefficients. In this
way, a convergent Fourier series singles out a distinguished representative from a collection
of essentially equivalent $2\pi$-periodic functions.


Remark:  The  term  “measure”  refers  to  a  rigorous  generalization  of  the  notion  of  the 
length of an interval to more general subsets $S \subset \mathbb{R}$. In particular, $S$ has measure zero if
it  can  be  covered  by  a  collection  of  intervals  of  arbitrarily  small  total  length.  For  example, 
any  set  consisting  of  finitely  many  points,  or  even  countably  many  points,  e.g.,  the  rational 
numbers,  has  measure  zero;  see  Exercise  3.5.19.  The  proper  development  of  the  notion  of 
measure,  and  the  consequential  Lebesgue  theory  of  integration,  is  properly  studied  in  a 
course  in  real  analysis,  [96,98]. 

As  a  consequence  of  Theorem  3.26,  a  Fourier  series  cannot  converge  uniformly  when 
discontinuities  are  present.  However,  it  can  be  proved,  [128],  that  even  when  the  function 
is  not  everywhere  continuous,  its  Fourier  series  is  uniformly  convergent  on  any  closed  subset 
of  continuity. 

Theorem 3.30. Let $f(x)$ be $2\pi$-periodic and piecewise $C^1$. If $f(x)$ is continuous on
the open interval $a < x < b$, then its Fourier series converges uniformly to $f(x)$ on any
closed subinterval $a + \delta \le x \le b - \delta$ for $0 < \delta < \tfrac{1}{2}(b - a)$.


For example, the Fourier series (3.49) for the unit step function converges uniformly
if we stay away from the discontinuities, for instance, by restriction to a subinterval of
the form $[\delta, \pi - \delta]$ or $[-\pi + \delta, -\delta]$ for any $0 < \delta < \tfrac{1}{2}\pi$. This reconfirms our observation
that the nonuniform Gibbs behavior becomes progressively more and more localized at the
discontinuities.
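
The localization can be observed numerically. The sketch below assumes that the step function series takes the standard form $\tfrac12 + \tfrac{2}{\pi} \sum_{j \ge 0} \sin((2j+1)x)/(2j+1)$ (equation (3.49) itself lies outside this section, so this form is an assumption), and compares the largest error on $[\delta, \pi - \delta]$ with the largest error on a grid that approaches the jump at $x = 0$. The value $\delta = 0.3$ and the function names are illustrative choices.

```python
import math


def step_partial_sum(x, n):
    """1/2 + (2/pi) * sum_{j=0}^{n-1} sin((2j+1) x) / (2j+1)."""
    s = sum(math.sin((2 * j + 1) * x) / (2 * j + 1) for j in range(n))
    return 0.5 + 2.0 * s / math.pi


def sup_error(n, a, b, samples=2001):
    """Largest |partial sum - 1| on a uniform grid over [a, b], with 0 < a < b < pi."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(step_partial_sum(x, n) - 1.0) for x in xs)


DELTA = 0.3  # distance kept from the discontinuities at 0 and pi
for n in (10, 40, 160):
    inner = sup_error(n, DELTA, math.pi - DELTA)
    near_jump = sup_error(n, math.pi / 2000, math.pi / 2)
    print(f"n={n:4d}  error on [delta, pi-delta]={inner:.4f}  error near jump={near_jump:.4f}")
```

Away from the jumps the error shrinks as more terms are added, while the error measured on the grid reaching toward $x = 0$ does not, in line with Theorem 3.30 and the Gibbs phenomenon.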


Exercises 


3.5.1. Consider the following sequence of planar vectors $v^{(n)} = \left( 1 - \tfrac{1}{n},\ e^{-n} \right)$, $n = 1, 2, 3, \ldots$
Prove that $v^{(n)}$ converges to $v^\star = (1, 0)$ as $n \to \infty$ by showing that: (a) the individual
components converge; (b) the Euclidean norms converge: $\| v^{(n)} - v^\star \|_2 \to 0$.

3.5.2. Which of the following sequences of vectors converge as $n \to \infty$? What is the limit?

/  \  |  1  n<2  \  f  \  (  \  (  cos  n  sin  n\  ,  n  (  1.1 

0)  I  ,  r?  »  ,  TTT?  )>  (b)  ( COS  n,  sinn),  (c)  (  — ,  — —  ),  (d)  (cos-,  sin 


(e) 


(h) 

(j) 


1  +  n2  5  1  +  2n2 


1  11.1 
—  cos  —  ,  —  sm  — 
n  n  5  n  n 


1  —  n  1  —  n  1 


n 


n 


n 


n 


(f) 

2 


e-n,  n e_n,  n2e"n 


5 


(g) 


O  1 

logn  (logn)  (logn)' 


n 


w 


l  +  n,l+n2,l  +  n: 


(i) 


i+i 

n 


"-(•-jr 


1  cos  n  —  1 


n 


n* 


(Jc)  (  n  (  e 


,1/n  —  1 ) ,  n2  (  cos  ^  —  1 


3.5.3. Which of the following sequences of functions converge pointwise for $x \in \mathbb{R}$ as $n \to \infty$?
What  is  the  limit?  (a)  1  —  (b)  e~nx ,  (c)  e~nx  ,  (d)  |x  — n|,  (e) 


(0 


1. 

2. 


x  <  n. 
x  >  n. 


(g) 


nx 

T?  —  <  X  <  — 
a  ,  n  ^  x  ^  n  ’ 

0,  otherwise, 


1  +  (x  —  n)2 


(h) 


r  x, 

X 

)  -2 

^  nx  , 

X 

<  n, 
>  n. 


3.5.4.  Prove  that  the  sequence  vn{x) 
formly,  to  the  zero  function. 


0  x  <^  — 

n  ’  converges  pointwise,  but  not  uni- 
0,  otherwise, 


3.5.5. Which of the following sequences of functions converge pointwise to the zero function for
all $x \in \mathbb{R}$? Which converge uniformly?

(d)  _  „  1  _9,  ,  (e)  1 


(a) 

X 

2  5 
nz 

(b) 

—  n  |  x 

-  (c) 

xe  71 

1  x  1 

5 

(f) 

|  x  —  n 

^  (&) 

r  i 

n  ’ 

0  <  x  | 

<  n, 

(b)  { 

l  0, 

otherwise, 

n  (1  +  x2) 


n,  0  <  |*|  <  y  f  z/n, 

0,  otherwise,  \  1  /(nx) 


1  +  (x  —  n)2 

x  |  <  1, 

X  >  1. 


3.5.6.  Does  the  sequence  vn(x)  =  nxe 
Does  it  converge  uniformly? 


—  nx 


converge  pointwise  to  the  zero  function  for  x  G  R? 


3.5.7.  Answer  Exercise  3.5.6  when  (a)  vn(x)  =  xe 

1,  n  <  x  <  n  +  1/n, 

0,  otherwise, 


(c)  v„(x)  =  <  ’  .  .  7  ’  (d)  uAx) 

w  nW  \  0,  otherwise,  v  7  nW 

/  \  /  \  f  l/>/n,  n  <  x  <  2n,  /  \ 

(e)  ”»(*)  =  (  0  otherwise,  <f)  = 


nx2  / 7  \  ,  x  I  1,  n  <  x  <  n  +  1. 

(b)  =  i  q,  otherwise, 

1/n,  n  <  x  <  2  n, 

nv~7  1  0,  otherwise, 

2  2 
n  x 


1. 


0, 


—  1/n  <  x  <  1/n, 
otherwise. 


3.5.8. (a) What is the limit of the functions $v_n(x) = \tan^{-1} n x$ as $n \to \infty$? (b) Is the convergence
uniform on all of $\mathbb{R}$? (c) on the interval $[-1, 1]$? (d) on the subset $\{ x \ge 1 \}$?

3.5.9. True or false: If $p_n(x)$ is a sequence of polynomials that converge pointwise to a polynomial
$p_\star(x)$, then the convergence is uniform.

3.5.10. Suppose $v_n(x)$ are continuous functions such that $v_n \to v_\star$ pointwise on all of $\mathbb{R}$.
True or false: (a) $v_n - v_\star \to 0$ pointwise; (b) if $v_\star(x) \ne 0$ for all $x$, then $v_n / v_\star \to 1$ pointwise.

3.5.11.  Which  of  the  following  series  satisfy  the  M- test  and  hence  converge  uniformly  on  the 

OO  L  rp  CXD  qiti  h  t  °°  7  OO 

interval  [0,1]?  (a)  £  ,  (b)  £  (c)  E  (d)  E  (*/2)fe , 

fc=l  ^  fc=l  ^  fc=l  fc=l 

oo  p fc .X  oo  —kx  oo  px/k  _  ■ 

(e)  E  V’  (f)  ^  V’  (g)  ^  f"lb— 

fc=l  ^  fc=l  ^  fc=l  ^ 


OO 


X 


k 


3.5.12.  Prove  that  the  power  series  -  /7 

fe= i  + 1) 


converges  uniformly  for  —1<x<1. 


3.5.13. (a) Prove the following result: Suppose $| g(x) | \le M$ for all $x \in I$. If (3.93) is a uniformly
convergent series on $I$, so is the term-wise product (3.94).

(b) Find a counterexample when $g(x)$ is not uniformly bounded.


♦ 3.5.14. Suppose each $u_k(x)$ is continuous, and the series $\sum_{k=1}^{\infty} u_k(x) = f(x)$ converges uniformly
on the bounded interval $a \le x \le b$. Prove that the integrated series (3.95) is uniformly
convergent.


oo 


§  3.5.15.  Prove  that  if  E  Tl  +  bl<  oo,  then  the  real  Fourier  series  (3.34)  converges  uniformly 

k  =  l 

to  a  continuous  27r-periodic  function. 


3.5.16. Suppose $\sum_{k=1}^{\infty} | a_k | < \infty$ and $\sum_{k=1}^{\infty} | b_k | < \infty$. Does the conclusion of Exercise 3.5.15 still
hold?


3.5.17. Explain why you only need check the inequalities (3.91) for all sufficiently large $k \gg 0$
in order to use the Weierstrass M-test.

3.5.18. Suppose we say that a sequence of vectors $v^{(k)} \in \mathbb{R}^m$ converges uniformly to $v^\star \in \mathbb{R}^m$
if, for every $\varepsilon > 0$, there is an $N$, depending only on $\varepsilon$, such that $| v_i^{(k)} - v_i^\star | < \varepsilon$, for all
$k \ge N$ and all $i = 1, \ldots, m$. Prove that every convergent sequence of vectors converges
uniformly.

♦ 3.5.19. (a) Let $S = \{ x_1, x_2, x_3, \ldots \} \subset \mathbb{R}$ be a countable set. Prove that $S$ has measure zero by
showing that, for every $\varepsilon > 0$, there exists a collection of open intervals $I_1, I_2, I_3, \ldots \subset \mathbb{R}$,
with respective lengths $\ell_1, \ell_2, \ell_3, \ldots$, such that $S \subset \bigcup_j I_j$, while the total length $\sum_j \ell_j = \varepsilon$.

(b) Explain why the set of rational numbers $\mathbb{Q} \subset \mathbb{R}$ is dense but nevertheless has measure
zero.


Smoothness  and  Decay 


The criterion (3.97), which guarantees uniform convergence of a Fourier series, requires,
at the very least, that the Fourier coefficients go to zero: $c_k \to 0$ as $k \to \pm\infty$. And they
cannot decay too slowly. For example, the individual summands of the infinite series

$$\sum_{0 \ne k = -\infty}^{\infty} \frac{1}{| k |^{\alpha}} \tag{3.98}$$

go to 0 as $| k | \to \infty$ whenever $\alpha > 0$, but the series converges only when $\alpha > 1$. (This is an
immediate consequence of the standard integral convergence test, [8, 97, 108].) Thus, if
we can bound the Fourier coefficients by

$$| c_k | \le \frac{M}{| k |^{\alpha}} \quad \text{for all } | k | \gg 0, \tag{3.99}$$

for some exponent $\alpha > 1$ and some positive constant $M > 0$, then the Weierstrass M-test
will guarantee that the Fourier series converges uniformly to a continuous function.

An  important  consequence  of  the  differentiation  formula  (3.79)  for  Fourier  series  is  that 
one  can  detect  the  degree  of  smoothness  of  a  function  by  seeing  how  rapidly  its  Fourier 
coefficients  decay  to  zero.  More  rigorously: 


Theorem 3.31. Let $0 \le n \in \mathbb{Z}$. If the Fourier coefficients of $f(x)$ satisfy

$$\sum_{k=-\infty}^{\infty} | k |^n\, | c_k | < \infty, \tag{3.100}$$

then the Fourier series (3.64) converges uniformly to an $n$-times continuously differentiable
function $\tilde f(x) \in C^n$, which is the $2\pi$-periodic extension of $f(x)$. Furthermore, for any $0 <
m \le n$, the $m$-times differentiated Fourier series converges uniformly to the corresponding
derivative $\tilde f^{(m)}(x)$.


Proof: Iterating (3.79), the Fourier series for the $n$th derivative of a function is

$$f^{(n)}(x) \sim \sum_{k=-\infty}^{\infty} i^n k^n c_k\, e^{i k x}. \tag{3.101}$$

If (3.100) holds, the Weierstrass M-test implies the uniform convergence of the differentiated
series (3.101) to a continuous $2\pi$-periodic function. Proposition 3.28 guarantees that
the limit is the $n$th derivative of the original Fourier series. Q.E.D.

This  result  enables  us  to  quantify  the  rule  of  thumb  that,  the  smaller  the  high- 
frequency  Fourier  coefficients,  the  smoother  the  function. 

Corollary 3.32. If the Fourier coefficients satisfy (3.99) for some $\alpha > n + 1$, then the
Fourier series converges uniformly to an $n$-times continuously differentiable $2\pi$-periodic
function.


If the Fourier coefficients go to zero faster than any power of $k$, e.g., exponentially
fast, then the function is infinitely differentiable. Analyticity is more delicate, and we refer
the reader to [128] for details.


Example 3.33. The $2\pi$-periodic extension of the function $| x |$ is continuous with
piecewise continuous first derivative. Its Fourier coefficients (3.54) satisfy the estimate
(3.99) for $\alpha = 2$, which is not quite fast enough to ensure a continuous second derivative.
On the other hand, the Fourier coefficients (3.36) of the step function $\sigma(x)$ tend to zero only
as $1/| k |$, so $\alpha = 1$, reflecting the fact that its periodic extension is piecewise continuous,
but not continuous.
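
These decay rates can be checked by approximating the coefficients numerically. The sketch below is a rough quadrature experiment rather than a derivation: it assumes the convention $c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-ikx}\, dx$, uses a midpoint rule, and takes the step function to be 1 for $x > 0$ and 0 otherwise. The name `fourier_coefficient` is hypothetical.

```python
import cmath
import math


def fourier_coefficient(f, k, cells=20_000):
    """Midpoint-rule approximation of (1/2pi) * integral_{-pi}^{pi} f(x) e^{-ikx} dx."""
    h = 2.0 * math.pi / cells
    total = 0j
    for j in range(cells):
        x = -math.pi + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / (2.0 * math.pi)


def step(x):
    """Unit step on (-pi, pi): 1 for x > 0, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


for k in (1, 3, 9, 27):
    c_abs = abs(fourier_coefficient(abs, k))
    c_step = abs(fourier_coefficient(step, k))
    print(f"k={k:3d}  k^2*|c_k| for |x| = {k * k * c_abs:.4f}   k*|c_k| for step = {k * c_step:.4f}")
```

For odd $k$ the rescaled values stay roughly constant (near $2/\pi$ and $1/\pi$ respectively), consistent with $\alpha = 2$ for $| x |$ and $\alpha = 1$ for the step function.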
Exercises


oo 


3.5.20.  (a)  Prove  that  the  complex  Fourier  series  f(x) 


El  j  _ 

^2  e  converges  uniformly  on 


k=  1 


the interval $[-\pi, \pi]$. (b) Is the sum $f(x)$ continuous? Why or why not?

(c)  Is  f(x)  continuously  differentiable?  Why  or  why  not? 

3.5.21. First, without explicitly evaluating them, how fast do you expect the Fourier coefficients
of the following functions to go to zero as $k \to \infty$? Then prove your claim by evaluating
the coefficients.


(a)  x 


7 r. 


(b) 


X 


(c)  xz ,  (d)  x^  —  27 t2x2,  (e)  sinz  x,  (f) 


sinx 


3.5.22.  Using  the  criteria  of  Theorem  3.31,  determine  how  many  continuous  derivatives  the 
functions  represented  by  the  following  Fourier  series  have: 


oo 


i  kx 


(a)  E 


k  =  — oo 


oo 


1  +  /c4 


i  kx 


(e)  E 

k  =  — oo 


k  ! 


oo  i  kx 

(b)  E 

k——  oo 
0 

oo 

(f)  (  1  —  COS 

k=  1 


oo 


k2  +  /c5 

1 


(c)  E  e 

k  =  — oo 

i  kx 


i  kx  —  k2 


oo 


i  kx 


(d)  E 


fc  Vo  k  +  1 


k2 


X  3.5.23.  Discuss  convergence  of  each  of  the  following  Fourier  series.  How  smooth  is  the  sum? 

Graph  the  partial  sums  to  obtain  a  reasonable  approximation  to  the  graph  of  the  summed 
series.  How  many  summands  are  needed  to  obtain  accuracy  in  the  second  decimal  digit  over 
the  entire  interval?  Point  out  discontinuities,  corners,  and  other  features  that  you  observe. 

—  k  ,  ,,  s  cos  kx  ,  s  U2,  sin  kx  ,  „  ^  sin  kx 


oo 

(a)  E  e 

k  =  0 


3.5.24.  Prove  that  if 


coskx,  (b)  E  t  ,  -i 

fc  =  0  K  +  -1 


(c)  £  fc3/2  ’ 


(■ d )  fc3+fc- 


$| a_k |, | b_k | \le M k^{-\alpha}$ for some $M > 0$ and $\alpha > n + 1$, then the real Fourier
series (3.34) converges uniformly to an $n$-times continuously differentiable $2\pi$-periodic
function $f \in C^n$.

3.5.25. Give a simple explanation of why, if the Fourier coefficients $a_k = b_k = 0$ for all sufficiently
large $k \gg 0$, then the Fourier series converges to an analytic function.


Hilbert  Space 


In  order  to  make  further  progress,  we  must  take  a  little  detour.  The  proper  setting  for 
the  rigorous  theory  of  Fourier  series  turns  out  to  be  the  most  important  function  space  in 
modern analysis and modern physics, known as Hilbert space in honor of the great
late-nineteenth-/early-twentieth-century German mathematician David Hilbert. The precise
definition  of  this  infinite-dimensional  inner  product  space  is  somewhat  technical,  but  a 
rough  version  goes  as  follows: 


Definition 3.34. A complex-valued function $f(x)$ is called square-integrable on the
interval $[-\pi, \pi]$ if it has finite $L^2$ norm:

$$\| f \|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} | f(x) |^2\, dx < \infty. \tag{3.102}$$

The Hilbert space $L^2 = L^2[-\pi, \pi]$ is the vector space consisting of all complex-valued
square-integrable functions.

The triangle inequality

$$\| f + g \| \le \| f \| + \| g \|$$

implies that if $f, g \in L^2$, so $\| f \|, \| g \| < \infty$, then $\| f + g \| < \infty$, and so $f + g \in L^2$.
Moreover, for any complex constant $c$,

$$\| c f \| = | c |\, \| f \|,$$

and so $c f \in L^2$ also. Thus, as claimed, Hilbert space is a complex vector space. The
Cauchy-Schwarz inequality

$$| \langle f, g \rangle | \le \| f \|\, \| g \|$$

implies that the $L^2$ Hermitian inner product

$$\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, \overline{g(x)}\, dx \tag{3.103}$$

of two square-integrable functions is well defined and finite. In particular, the Fourier
coefficients of a function $f \in L^2$ are specified by its inner products

$$c_k = \langle f, e^{i k x} \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-i k x}\, dx$$

with the complex exponentials (which, by (3.63), are in $L^2$), and hence are all well defined
and finite.

There  are  some  interesting  analytic  subtleties  that  arise  when  one  tries  to  prescribe 
precisely  which  functions  are  in  the  Hilbert  space.  Every  piecewise  continuous  function 
belongs  to  L2.  But  some  functions  with  singularities  are  also  members.  For  example,  the 
power function $| x |^{-\alpha}$ belongs to $L^2$ for any $\alpha < \tfrac{1}{2}$, but not if $\alpha \ge \tfrac{1}{2}$.
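
The threshold exponent can be confirmed by a direct computation. The following worked equation is a sketch that assumes the norm in (3.102) carries the normalizing factor $1/(2\pi)$, matching the inner product (3.103):

```latex
\bigl\| \, |x|^{-\alpha} \bigr\|^2
  = \frac{1}{2\pi} \int_{-\pi}^{\pi} |x|^{-2\alpha} \, dx
  = \frac{1}{\pi} \int_{0}^{\pi} x^{-2\alpha} \, dx
  = \frac{\pi^{-2\alpha}}{1 - 2\alpha} < \infty
  \qquad \text{when } \alpha < \tfrac{1}{2},
\qquad\text{whereas}\qquad
\int_{0}^{\pi} x^{-2\alpha} \, dx = \infty
  \qquad \text{when } \alpha \ge \tfrac{1}{2}.
```

In the borderline case $\alpha = \tfrac12$ the integrand is $1/x$, whose integral diverges logarithmically at the origin.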

Analysis relies on limiting procedures, and it is essential that Hilbert space be “complete”
in the sense that appropriately convergent† sequences of functions have a limit. The
completeness requirement is not elementary, and relies on the development of the more
sophisticated Lebesgue theory of integration, which was formalized in the early part of
the twentieth century by the French mathematician Henri Lebesgue. Any function which
is square-integrable in the Lebesgue sense is admitted into $L^2$. This includes such
non-piecewise-continuous functions as $\sin \tfrac{1}{x}$ and $x^{-1/3}$, as well as the strange function

$$r(x) = \begin{cases} 1 & \text{if } x \text{ is a rational number}, \\ 0 & \text{if } x \text{ is irrational}. \end{cases} \tag{3.104}$$

Thus,  while  well  behaved  in  some  respects,  square-integrable  functions  can  be  quite  wild 
in  others. 


Remark: The completeness of Hilbert space can be viewed as the infinite-dimensional
analogue of the completeness of the real line $\mathbb{R}$, meaning that every Cauchy
sequence of real numbers has a limit in $\mathbb{R}$. On the other hand, the rational numbers $\mathbb{Q}$ are
not complete (since a convergent sequence of rational numbers may well have an irrational
limit) but form a dense subset of $\mathbb{R}$, because every real number can be arbitrarily closely
approximated by rational numbers, e.g., its truncated decimal expansions. Indeed, a fully
rigorous definition of the real numbers $\mathbb{R}$ is somewhat delicate, [96, 97].

† The precise technical requirement is that every Cauchy sequence of functions $v_k \in L^2$
converges to a function $v_\star \in L^2$; see [37, 96, 98] and also Exercise 3.5.42 for details.

Similarly, the space of continuous functions $C^0[-\pi, \pi]$ is not complete, in that the limit of a
(nonuniformly) convergent sequence of continuous functions is not, in general, continuous, but it
does form a dense subspace of the Hilbert space $L^2[-\pi, \pi]$, since every $L^2$ function can be
arbitrarily closely approximated (in norm) by continuous functions, e.g., its approximating
trigonometric polynomials. Thus, just as $\mathbb{R}$ can be viewed as the completion of $\mathbb{Q}$ under
the Euclidean norm, so Hilbert space can be viewed as the completion of the space of continuous
functions under the $L^2$ norm, and, just like that of $\mathbb{R}$, its fully rigorous definition
is rather subtle.

A second complication is that (3.102) does not, strictly speaking, define a norm once
we allow discontinuous functions into the fold. For example, the piecewise continuous
function

$$f_0(x) = \begin{cases} 1, & x = 0, \\ 0, & x \ne 0, \end{cases} \tag{3.105}$$

has norm zero, $\| f_0 \| = 0$, even though it is not zero everywhere. Indeed, any function
that  is  zero  except  on  a  set  of  measure  zero  also  has  norm  zero,  including  the  function 
(3.104).  Therefore,  in  order  to  make  (3.102)  into  a  legitimate  norm,  we  must  agree  to 
identify  any  two  functions  that  have  the  same  values  except  on  a  set  of  measure  zero. 
Thus,  the  zero  function  0  along  with  the  preceding  examples  (3.104)  and  (3.105)  are  all 
viewed  as  defining  the  same  element  of  Hilbert  space.  So,  an  element  of  Hilbert  space  is 
not,  in  fact,  a  function,  but,  rather,  an  equivalence  class  of  functions  all  differing  on  a  set 
of  measure  zero.  All  this  may  strike  the  applications-oriented  reader  as  becoming  much  too 
abstract  and  arcane.  In  practice,  you  will  not  lose  much  by  working  with  the  elements  of 
L2  as  if  they  were  ordinary  functions,  and,  even  better,  assuming  that  said  “functions”  are 
always  piecewise  continuous  and  square-integrable.  Nevertheless,  the  full  analytical  power 
of  Hilbert  space  theory  is  unleashed  only  by  including  completely  general  square-integrable 
functions. 

After  its  invention  by  pure  mathematicians  around  the  turn  of  the  twentieth  century, 
physicists  in  the  1920s  suddenly  realized  that  Hilbert  space  was  the  ideal  setting  for  the 
modern theory of quantum mechanics, [66, 72, 115]. A quantum-mechanical wave function
is an element $\varphi \in L^2$ that has unit norm: $\| \varphi \| = 1$. Thus, the set of wave functions is
merely the “unit sphere” in Hilbert space. Quantum mechanics endows each physical
wave function with a probabilistic interpretation. Suppose the wave function represents
a single subatomic particle (photon, electron, etc.). Then the squared modulus of the
wave function, $| \varphi(x) |^2$, represents the probability density that quantifies the chance of the
particle being located at position $x$. More precisely, the probability that the particle resides
in a prescribed interval $[a, b] \subset [-\pi, \pi]$ is equal to† $\frac{1}{2\pi} \int_a^b | \varphi(x) |^2\, dx$. In particular, the
wave function has unit norm,

$$\| \varphi \| = \sqrt{ \frac{1}{2\pi} \int_{-\pi}^{\pi} | \varphi(x) |^2\, dx } = 1, \tag{3.106}$$

because the particle must certainly, i.e., with probability 1, be somewhere!

† Here we are acting as if the physical universe were represented by the one-dimensional interval
$[-\pi, \pi]$. The more apt context of three-dimensional physical space is developed analogously,
replacing the single integral by a triple integral over all of $\mathbb{R}^3$. See also Section 7.4.


Convergence  in  Norm 

We  are  now  in  a  position  to  discuss  convergence  in  norm  of  a  Fourier  series.  We  begin  with 
the  basic  definition,  which  makes  sense  on  any  normed  vector  space. 

Definition 3.35. Let $V$ be a normed vector space. A sequence $s_1, s_2, s_3, \ldots \in V$ is
said to converge in norm to $f \in V$ if $\| s_n - f \| \to 0$ as $n \to \infty$.

As  we  noted  earlier,  on  finite-dimensional  vector  spaces  such  as  Mm,  convergence  in 
norm  is  equivalent  to  ordinary  convergence.  On  the  other  hand,  on  infinite-dimensional 
function  spaces,  convergence  in  norm  differs  from  pointwise  convergence.  For  instance,  it 
is  possible  to  construct  a  sequence  of  functions  that  converges  in  norm  to  0,  but  does  not 
converge  pointwise  anywherel  (See  Exercise  3.5.43.) 
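A simpler, less dramatic illustration of the gap between the two notions can be sketched numerically. The example below is an assumption chosen for illustration, not one taken from the text: the functions $f_n(x) = x^n$ on $[0, 1]$, measured in the unscaled $L^2$ norm, satisfy $\|f_n\|^2 = 1/(2n+1) \to 0$, so they converge in norm to the zero function, yet $f_n(1) = 1$ for every $n$, so the pointwise limit at $x = 1$ is not zero. The helper name `l2_norm` is hypothetical.

```python
import math

def l2_norm(f, a=0.0, b=1.0, n=20000):
    """Approximate sqrt(integral_a^b f(x)^2 dx) by the midpoint rule."""
    h = (b - a) / n
    total = sum(f(a + (i + 0.5) * h) ** 2 for i in range(n))
    return math.sqrt(total * h)

for n in (1, 10, 100):
    f_n = lambda x, n=n: x ** n
    # The norm shrinks with n, while the value at x = 1 stays equal to 1.
    print(n, l2_norm(f_n), f_n(1.0))
```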

While our immediate interest is in the convergence of the Fourier series of a square-integrable
function $f \in L^2[-\pi, \pi]$, the methods we develop are of very general utility.


Indeed,  in  later  chapters  we  will  require  the  analogous  convergence  results  for  other  types 
of  series  solutions  to  partial  differential  equations,  including  multiple  Fourier  series  as  well 
as  series  involving  Bessel  functions,  spherical  harmonics,  Laguerre  polynomials,  and  so  on. 
Since  it  distills  the  key  issues  down  to  their  essence,  the  general,  abstract  version  is,  in  fact, 
easier  to  digest,  and,  moreover,  will  be  immediately  applicable,  not  just  to  basic  Fourier 
series, but to very general “eigenfunction series”.

Let $V$ be an infinite-dimensional inner product space, e.g., $L^2[-\pi, \pi]$. Suppose
$\varphi_1, \varphi_2, \varphi_3, \ldots \in V$ are an orthonormal collection of elements of $V$, meaning that

$$\langle \varphi_j, \varphi_k \rangle = \begin{cases} 1, & j = k, \\ 0, & j \ne k. \end{cases} \tag{3.107}$$

A straightforward argument (see Exercise 3.5.33) proves that the $\varphi_k$ are linearly
independent. Given $f \in V$, we form its generalized Fourier series

$$f \sim \sum_{k=1}^{\infty} c_k \varphi_k, \qquad \text{where} \qquad c_k = \langle f, \varphi_k \rangle. \tag{3.108}$$

The formula for the coefficient $c_k$ is obtained by formally taking the inner product of the
series with $\varphi_k$ and invoking the orthonormality conditions (3.107). The two main examples
are the real and complex $L^2$ spaces:

• $V$ consists of real square-integrable functions defined on $[-\pi, \pi]$ under the rescaled $L^2$
inner product $\langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\, g(x) \, dx$. The orthonormal system $\{\varphi_k\}$ consists
of the basic trigonometric functions, numbered as follows:

$$\varphi_1 = \frac{1}{\sqrt{2}}, \quad \varphi_2 = \cos x, \quad \varphi_3 = \sin x, \quad \varphi_4 = \cos 2x, \quad \varphi_5 = \sin 2x, \quad \varphi_6 = \cos 3x, \quad \ldots$$

• $V$ consists of complex square-integrable functions defined on $[-\pi, \pi]$ using the Hermitian
inner product (3.103). The orthonormal system $\{\varphi_k\}$ consists of the complex
exponentials, which we order as follows:

$$\varphi_1 = 1, \quad \varphi_2 = e^{ix}, \quad \varphi_3 = e^{-ix}, \quad \varphi_4 = e^{2ix}, \quad \varphi_5 = e^{-2ix}, \quad \varphi_6 = e^{3ix}, \quad \ldots$$

In each case, the generalized Fourier series (3.108) reduces to the ordinary Fourier series,
with a minor change of indexing. Later, when we extend the separation of variables
technique to partial differential equations in more than one space dimension, we will encounter
a variety of other important examples, in which the $\varphi_k$ are the eigenfunctions of a
self-adjoint linear operator.
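The orthonormality of the real trigonometric system under the rescaled inner product can be checked numerically. The sketch below is only an illustration: the helper names `inner`, `phi`, and `gram` are assumptions, and the integral is approximated by the midpoint rule, which is essentially exact for low-frequency trigonometric polynomials over a full period.

```python
import math

def inner(f, g, n=2000):
    """Rescaled L2 inner product (1/pi) * integral_{-pi}^{pi} f(x) g(x) dx (midpoint rule)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h / math.pi

def phi(k):
    """k-th function of the real system: 1/sqrt(2), cos x, sin x, cos 2x, sin 2x, ..."""
    if k == 1:
        return lambda x: 1.0 / math.sqrt(2.0)
    m = k // 2                       # frequency of the k-th function
    if k % 2 == 0:
        return lambda x: math.cos(m * x)
    return lambda x: math.sin(m * x)

def gram(count=6):
    """Matrix of inner products <phi_j, phi_k>; it should be close to the identity."""
    return [[inner(phi(j), phi(k)) for k in range(1, count + 1)]
            for j in range(1, count + 1)]

for row in gram():
    print([round(entry, 6) for entry in row])
```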

For  the  remainder  of  this  section,  to  streamline  the  ensuing  proofs,  we  will  henceforth 
assume  that  V  is  a  real  inner  product  space.  However,  all  results  will  be  formulated  so 
they  are  also  valid  for  complex  inner  product  spaces;  the  slightly  more  complicated  proofs 
in  the  complex  case  are  relegated  to  the  exercises. 

By definition, the generalized Fourier series (3.108) converges in norm to $f$ if the
sequence provided by its partial sums

$$s_n = \sum_{k=1}^{n} c_k \varphi_k \tag{3.109}$$

satisfies the criterion of Definition 3.35. Our first result states that the partial Fourier
sum (3.109), with $c_k$ given by the inner product formula in (3.108), is, in fact, the best
approximation to $f \in V$ in the least squares sense, [89].


Theorem 3.36. Let $V_n = \operatorname{span}\{\varphi_1, \varphi_2, \ldots, \varphi_n\} \subset V$ be the $n$-dimensional subspace
spanned by the first $n$ elements of the orthonormal system. Then the $n$th order Fourier
partial sum $s_n \in V_n$ is the best least squares approximation to $f$ that belongs to the
subspace, meaning that it minimizes $\|f - p_n\|$ among all possible $p_n \in V_n$.


Proof: Given any element

$$p_n = \sum_{k=1}^{n} d_k \varphi_k \in V_n,$$

we have, in view of the orthonormality relations (3.107),

$$\|p_n\|^2 = \langle p_n, p_n \rangle = \Big\langle \sum_{j=1}^{n} d_j \varphi_j, \sum_{k=1}^{n} d_k \varphi_k \Big\rangle = \sum_{j,k=1}^{n} d_j d_k \langle \varphi_j, \varphi_k \rangle = \sum_{k=1}^{n} |d_k|^2, \tag{3.110}$$

reproducing the formula (B.27) for the norm with respect to an orthonormal basis. Therefore,
by the symmetry property of the real inner product,

$$\begin{aligned}
\|f - p_n\|^2 &= \langle f - p_n, f - p_n \rangle = \|f\|^2 - 2 \langle f, p_n \rangle + \|p_n\|^2 \\
&= \|f\|^2 - 2 \sum_{k=1}^{n} d_k \langle f, \varphi_k \rangle + \|p_n\|^2 = \|f\|^2 - 2 \sum_{k=1}^{n} c_k d_k + \sum_{k=1}^{n} |d_k|^2 \\
&= \|f\|^2 - \sum_{k=1}^{n} |c_k|^2 + \sum_{k=1}^{n} |c_k - d_k|^2.
\end{aligned}$$

The final equality results from adding and subtracting the squared norm of the partial sum
(3.109),

$$\|s_n\|^2 = \sum_{k=1}^{n} |c_k|^2, \tag{3.111}$$

which is a particular case of (3.110). We conclude that

$$\|f - p_n\|^2 = \|f\|^2 - \|s_n\|^2 + \sum_{k=1}^{n} |c_k - d_k|^2. \tag{3.112}$$

The first and second terms on the right-hand side of (3.112) are uniquely determined by
$f$ and hence cannot be altered by the choice of $p_n \in V_n$, which affects only the final
summation. Since the latter is a sum of nonnegative quantities, it is clearly minimized by
setting all its summands to zero, i.e., setting $d_k = c_k$ for all $k = 1, \ldots, n$. We conclude
that $\|f - p_n\|$ achieves its minimum value among all $p_n \in V_n$ if and only if $d_k = c_k$, which
implies that $p_n = s_n$ is the Fourier partial sum (3.109). Q.E.D.
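The identity (3.112) lends itself to a quick numerical experiment. The following sketch is an illustration under stated assumptions: it uses the real trigonometric system with the rescaled inner product, the sample function $f(x) = x$, and a midpoint-rule quadrature; the names `inner`, `phi`, and `sq_error` are hypothetical helpers. Perturbing each Fourier coefficient by 0.1 should increase the squared error by $\sum |c_k - d_k|^2 = 5 \times 0.01 = 0.05$.

```python
import math

N = 4000  # number of midpoint-rule nodes (an arbitrary choice)

def inner(f, g):
    """Rescaled inner product (1/pi) * integral_{-pi}^{pi} f(x) g(x) dx (midpoint rule)."""
    h = 2.0 * math.pi / N
    total = 0.0
    for i in range(N):
        x = -math.pi + (i + 0.5) * h
        total += f(x) * g(x)
    return total * h / math.pi

def phi(k):
    """k-th function of the real system: 1/sqrt(2), cos x, sin x, cos 2x, sin 2x, ..."""
    if k == 1:
        return lambda x: 1.0 / math.sqrt(2.0)
    m = k // 2
    return (lambda x: math.cos(m * x)) if k % 2 == 0 else (lambda x: math.sin(m * x))

def sq_error(f, d):
    """Squared least squares error ||f - p_n||^2 for p_n = sum_k d[k-1] * phi_k."""
    basis = [phi(k) for k in range(1, len(d) + 1)]
    residual = lambda x: f(x) - sum(dk * b(x) for dk, b in zip(d, basis))
    return inner(residual, residual)

f = lambda x: x
n = 5
c = [inner(f, phi(k)) for k in range(1, n + 1)]   # Fourier coefficients c_k = <f, phi_k>
best = sq_error(f, c)                             # error of the Fourier partial sum s_n
worse = sq_error(f, [ck + 0.1 for ck in c])       # error of a perturbed competitor p_n
print(best, worse, worse - best)
```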

Example 3.37. Consider the ordinary real Fourier series. The subspace $T^{(n)} \subset L^2$
spanned by the trigonometric functions $\cos kx$, $\sin kx$, for $0 \le k \le n$, consists of all
trigonometric polynomials (finite Fourier sums) of degree $\le n$:

$$p_n(x) = \frac{r_0}{2} + \sum_{k=1}^{n} \left[\, r_k \cos kx + s_k \sin kx \,\right]. \tag{3.113}$$

Theorem 3.36 implies that the $n$th Fourier partial sum (3.38) is distinguished as the one
that best approximates $f(x)$ in the least squares sense, meaning that it minimizes the $L^2$
norm of the difference,

$$\|f - p_n\| = \sqrt{\frac{1}{\pi} \int_{-\pi}^{\pi} |f(x) - p_n(x)|^2 \, dx}, \tag{3.114}$$

among all such trigonometric polynomials (3.113).


Returning to the general framework, if we set $p_n = s_n$, so $d_k = c_k$, in (3.112), we
conclude that the minimizing least squares error for the Fourier partial sum is

$$0 \le \|f - s_n\|^2 = \|f\|^2 - \|s_n\|^2 = \|f\|^2 - \sum_{k=1}^{n} |c_k|^2. \tag{3.115}$$

We conclude that the general Fourier coefficients of the function $f$ must satisfy the
inequality

$$\sum_{k=1}^{n} |c_k|^2 \le \|f\|^2. \tag{3.116}$$

Let us see what happens in the limit as $n \to \infty$. Since we are summing a sequence of
nonnegative numbers, with uniformly bounded partial sums, the limiting summation must
exist and be subject to the same bound. We have thus established Bessel's inequality, a
key step on the road to the general theory.


Theorem 3.38. The sum of the squares of the general Fourier coefficients of $f \in V$
is bounded by

$$\sum_{k=1}^{\infty} |c_k|^2 \le \|f\|^2. \tag{3.117}$$
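Bessel's inequality (3.117) can be watched in action on a concrete function. The sketch below assumes the sample function $f(x) = x$ on $[-\pi, \pi]$ with the rescaled real inner product; a standard integration by parts gives vanishing cosine coefficients and sine coefficients $b_k = 2(-1)^{k+1}/k$, while $\|f\|^2 = \frac{1}{\pi}\int_{-\pi}^{\pi} x^2 \, dx = \frac{2}{3}\pi^2$. The function name `bessel_partial_sum` is a hypothetical helper.

```python
import math

def bessel_partial_sum(n):
    """Sum of the squared Fourier coefficients of f(x) = x up to frequency n."""
    return sum((2.0 * (-1) ** (k + 1) / k) ** 2 for k in range(1, n + 1))

norm_sq = 2.0 * math.pi ** 2 / 3.0   # ||f||^2 = (1/pi) * integral of x^2 over [-pi, pi]

for n in (1, 10, 100, 1000):
    # The partial sums increase with n but never exceed ||f||^2.
    print(n, bessel_partial_sum(n), norm_sq)
```

In this example the partial sums creep up toward $\|f\|^2$ itself, which is what one expects when the underlying system is complete.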


Now,  if  a  series,  such  as  that  on  the  left-hand  side  of  Bessel’s  inequality  (3.117),  is  to 
converge,  the  individual  summands  must  go  to  zero.  Thus,  we  immediately  deduce: 
Corollary 3.39. The general Fourier coefficients of $f \in V$ satisfy $c_k \to 0$ as $k \to \infty$.


In  the  case  of  the  trigonometric  Fourier  series,  Corollary  3.39  yields  the  following 
simplified  form  of  what  is  known  as  the  Riemann-Lebesgue  Lemma. 


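Lemma 3.40. If $f \in L^2[-\pi, \pi]$ is square-integrable, then its Fourier coefficients
satisfy

$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos kx \, dx \longrightarrow 0, \qquad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin kx \, dx \longrightarrow 0, \qquad \text{as} \quad k \to \infty. \tag{3.118}$$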


Remark: This result is equivalent to the decay of the complex Fourier coefficients

$$c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-ikx} \, dx \longrightarrow 0 \qquad \text{as} \quad |k| \to \infty, \tag{3.119}$$

of any complex-valued square-integrable function.


Convergence of the sum (3.117) requires that the coefficients $c_k$ not tend to zero too
slowly. For instance, requiring the power bound (3.99) for some $\alpha > \frac{1}{2}$ suffices to ensure
that $\sum_{k=-\infty}^{\infty} |c_k|^2 < \infty$. Thus, as we should have expected, convergence in norm of the
Fourier series imposes less-restrictive requirements on the decay of the Fourier coefficients
than uniform convergence, which needed $\alpha > 1$. Indeed, a Fourier series with slowly
decaying coefficients may very well converge in norm to a discontinuous $L^2$ function, which
is not possible under uniform convergence.


Completeness 


Calculations  in  vector  spaces  rely  on  the  specification  of  a  basis,  meaning  a  set  of  linearly 
independent  elements  that  span  the  space.  The  choice  of  basis  serves  to  introduce  a  system 
of  local  coordinates  on  the  space,  namely,  the  coefficients  in  the  expression  of  an  element 
as a linear combination of basis elements. Orthogonal and orthonormal bases are particularly
handy, since the coordinates are immediately calculated by taking inner products,
while  general  bases  require  solving  linear  systems.  In  finite-dimensional  vector  spaces,  all 
bases  contain  the  same  number  of  elements,  which,  by  definition,  is  the  dimension  of  the 
space.  A  vector  space  is,  therefore,  infinite-dimensional  if  it  contains  an  infinite  number 
of  linearly  independent  elements.  However,  the  question  when  such  a  collection  forms  a 
basis  for  the  space  is  considerably  more  delicate,  and  mere  counting  will  no  longer  suffice. 
Indeed,  omitting  a  finite  number  of  elements  from  an  infinite  collection  would  still  leave  an 
infinite  number,  but  the  latter  will  certainly  not  span  the  space.  Moreover,  we  cannot,  in 
general,  expect  to  write  a  general  element  of  an  infinite-dimensional  space  as  a  finite  linear 
combination  of  basis  elements,  and  so  subtle  questions  of  convergence  of  infinite  series  must 
also  be  addressed  if  we  are  to  properly  formulate  the  concept. 

The definition of a basis of an infinite-dimensional vector space rests on the idea of
completeness. We shall discuss completeness in the general abstract setting, but the key
example is, of course, the Hilbert space $L^2[-\pi, \pi]$ and the systems of trigonometric or
complex exponential functions. For simplicity, we define completeness in terms of orthonormal
systems here. (Similar arguments will clearly apply to orthogonal systems, but normality
helps to streamline the presentation.)

Definition 3.41. An orthonormal system $\varphi_1, \varphi_2, \varphi_3, \ldots \in V$ is called complete if,
for every $f \in V$, its generalized Fourier series (3.108) converges in norm to $f$:

$$\|f - s_n\| \longrightarrow 0 \quad \text{as} \quad n \to \infty, \qquad \text{where} \qquad s_n = \sum_{k=1}^{n} c_k \varphi_k, \quad c_k = \langle f, \varphi_k \rangle, \tag{3.120}$$

is the $n$th partial sum of the generalized Fourier series (3.108).

Thus, completeness requires that every element of $V$ can be arbitrarily closely
approximated (in norm) by a finite linear combination of the basis elements. A complete
orthonormal system should be viewed as the infinite-dimensional version of an orthonormal
basis of a finite-dimensional vector space. An orthogonal system is called complete
whenever the corresponding orthonormal system obtained by dividing the elements by
their norms is complete. Existence of a complete orthonormal system is directly tied to
completeness of the underlying Hilbert space.

Determining whether a given orthonormal or orthogonal system of functions is complete
is a difficult problem, and requires some detailed analysis of their properties. The
key result for classical Fourier series is that the trigonometric functions, or, equivalently,
the complex exponentials, form a complete system; an indication of its proof will appear
below. A general characterization of complete orthonormal eigenfunction systems can be
found in Section 9.4.


Theorem 3.42. The trigonometric functions $1$, $\cos kx$, $\sin kx$, $k = 1, 2, 3, \ldots$,
form a complete orthogonal system in $L^2 = L^2[-\pi, \pi]$. In other words, if $s_n(x)$ denotes
the $n$th partial sum of the Fourier series of the square-integrable function $f(x) \in L^2$, then

$$\lim_{n \to \infty} \|f - s_n\| = 0.$$


To  better  comprehend  completeness,  let  us  describe  some  equivalent  characterizations 
and  consequences.  One  is  the  infinite-dimensional  counterpart  of  formula  (B.27)  for  the 
norm  of  a  vector  in  terms  of  its  coordinates  with  respect  to  an  orthonormal  basis. 


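Theorem 3.43. The orthonormal system $\varphi_1, \varphi_2, \varphi_3, \ldots \in V$ is complete if and only
if Plancherel's formula

$$\|f\|^2 = \sum_{k=1}^{\infty} |c_k|^2 = \sum_{k=1}^{\infty} |\langle f, \varphi_k \rangle|^2 \tag{3.121}$$

holds for every $f \in V$.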


Proof: Theorem 3.43, thus, states that the system of functions is complete if and only
if the Bessel inequality (3.117) is, in fact, an equality. Indeed, letting $n \to \infty$ in (3.115),
we find

$$\lim_{n \to \infty} \|f - s_n\|^2 = \|f\|^2 - \lim_{n \to \infty} \sum_{k=1}^{n} |c_k|^2 = \|f\|^2 - \sum_{k=1}^{\infty} |c_k|^2.$$

Therefore, the completeness condition (3.120) holds if and only if the right-hand side
vanishes, which is the Plancherel identity (3.121). Q.E.D.
An analogous result holds for the inner product between two elements, which we state
in its general complex form, although the proof given here is for the real version; in Exercise
3.5.35 the reader is asked to supply the slightly more intricate complex proof.

Corollary 3.44. The Fourier coefficients $c_k = \langle f, \varphi_k \rangle$, $d_k = \langle g, \varphi_k \rangle$, of any
$f, g \in V$ relative to a complete orthonormal system satisfy Parseval's formula

$$\langle f, g \rangle = \sum_{k=1}^{\infty} c_k \overline{d_k}. \tag{3.122}$$

Proof: Since, for a real inner product,

$$\langle f, g \rangle = \tfrac{1}{4} \left( \|f + g\|^2 - \|f - g\|^2 \right), \tag{3.123}$$

Parseval's formula results from applying Plancherel's formula (3.121) to each term on the
right-hand side:

$$\langle f, g \rangle = \frac{1}{4} \sum_{k=1}^{\infty} \left[ (c_k + d_k)^2 - (c_k - d_k)^2 \right] = \sum_{k=1}^{\infty} c_k d_k,$$

which agrees with (3.122), since we are assuming that $d_k = \overline{d_k}$ are all real. Q.E.D.

Note that Plancherel's formula is a special case of Parseval's formula,† obtained by
setting $f = g$. In the particular case of the complex exponential basis $e^{ikx}$ of $L^2[-\pi, \pi]$,
the Plancherel and Parseval formulae take the form

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx = \sum_{k=-\infty}^{\infty} |c_k|^2, \qquad \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, \overline{g(x)} \, dx = \sum_{k=-\infty}^{\infty} c_k \overline{d_k}, \tag{3.124}$$

where $c_k = \langle f, e^{ikx} \rangle$, $d_k = \langle g, e^{ikx} \rangle$ are the ordinary Fourier coefficients of the complex-valued
functions $f(x)$ and $g(x)$. In Exercise 3.5.38, you are asked to write the corresponding
formulas for the real Fourier coefficients.
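The complex Plancherel and Parseval formulae (3.124) can be tested numerically on a pair of simple functions. The sketch below is illustrative only: the sample functions $f(x) = |x|$ and $g(x) = x^2$, the truncation level `K`, the midpoint-rule quadrature, and the helper names `hermitian` and `fourier_coefficient` are all assumptions. Because the sums are truncated at $|k| \le K$, the two sides agree only up to a small tail.

```python
import cmath
import math

N = 4000  # number of midpoint-rule nodes (an arbitrary choice)
NODES = [-math.pi + (i + 0.5) * 2.0 * math.pi / N for i in range(N)]

def hermitian(f, g):
    """Approximate <f, g> = (1/(2*pi)) * integral of f(x) * conj(g(x)) over [-pi, pi]."""
    return sum(f(x) * complex(g(x)).conjugate() for x in NODES) / N

def fourier_coefficient(f, k):
    """Complex Fourier coefficient c_k = <f, e^{ikx}>."""
    return hermitian(f, lambda x: cmath.exp(1j * k * x))

f = abs                  # f(x) = |x|
g = lambda x: x * x      # g(x) = x^2
K = 40                   # truncate the infinite sums at |k| <= K

c = {k: fourier_coefficient(f, k) for k in range(-K, K + 1)}
d = {k: fourier_coefficient(g, k) for k in range(-K, K + 1)}

plancherel_lhs = hermitian(f, f).real
plancherel_rhs = sum(abs(ck) ** 2 for ck in c.values())
parseval_lhs = hermitian(f, g)
parseval_rhs = sum(c[k] * d[k].conjugate() for k in c)
print(plancherel_lhs, plancherel_rhs)
print(parseval_lhs, parseval_rhs)
```

For these two functions the exact left-hand sides are $\pi^2/3$ and $\pi^3/4$, respectively.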

Completeness also tells us that a function is uniquely determined by its Fourier
coefficients.


Proposition 3.45. If the orthonormal system $\varphi_1, \varphi_2, \ldots \in V$ is complete, then the
only element $f \in V$ with all zero Fourier coefficients, $0 = c_1 = c_2 = \cdots$, is the zero element:
$f = 0$. More generally, two elements $f, g \in V$ have the same Fourier coefficients if and only
if they are the same: $f = g$.

Proof: The proof is an immediate consequence of Plancherel's formula. Indeed, if
$c_k = 0$ for all $k$, then (3.121) implies that $\|f\| = 0$ and hence $f = 0$. The second statement follows
by applying the first to their difference $f - g$. Q.E.D.

Another way of stating this result is that the only function that is orthogonal to every
element of a complete orthonormal system is the zero function.‡ In other words, a complete
orthonormal system is maximal in the sense that no further orthonormal elements can be
appended to it.

† Curiously, Marc-Antoine Parseval des Chênes' contribution slightly predates Fourier, whereas
Michel Plancherel's appeared almost a century later.

‡ Or, to be more technically accurate, any function that is zero outside a set of measure zero.
Let us now discuss the completeness of the Fourier trigonometric and complex
exponential functions. We shall establish the completeness property only for sufficiently smooth
functions, leaving the harder general proof to the references, [37, 128].

According to Theorem 3.30, if $f(x)$ is continuous, $2\pi$-periodic, and piecewise $C^1$, its
Fourier series converges uniformly,

$$f(x) = \sum_{k=-\infty}^{\infty} c_k e^{ikx} \qquad \text{for all} \quad x.$$

The same holds for its complex conjugate $\overline{f(x)}$. Therefore,

$$|f(x)|^2 = f(x)\, \overline{f(x)} = f(x) \sum_{k=-\infty}^{\infty} \overline{c_k}\, e^{-ikx} = \sum_{k=-\infty}^{\infty} \overline{c_k}\, f(x)\, e^{-ikx},$$

which also converges uniformly by (3.94). Formula (3.95) permits us to integrate both
sides from $-\pi$ to $\pi$, yielding

$$\|f\|^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx = \sum_{k=-\infty}^{\infty} \frac{\overline{c_k}}{2\pi} \int_{-\pi}^{\pi} f(x)\, e^{-ikx} \, dx = \sum_{k=-\infty}^{\infty} c_k \overline{c_k} = \sum_{k=-\infty}^{\infty} |c_k|^2.$$

Therefore, Plancherel's formula (3.121) holds for any continuous, piecewise $C^1$ function.

With some additional technical work, this result is used to establish the validity of
Plancherel's formula for all $f \in L^2$, the key step being to suitably approximate $f$ by such
continuous, piecewise $C^1$ functions. With this in hand, completeness is an immediate
consequence of Theorem 3.43. Q.E.D.


Pointwise  Convergence 


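Let us finally return to the Pointwise Convergence Theorem 3.8 for the trigonometric
Fourier series. The goal is to prove that, under the appropriate hypotheses on $f(x)$, namely
$2\pi$-periodic and piecewise $C^1$, the limit of its partial Fourier sums is

$$\lim_{n \to \infty} s_n(x) = \tfrac{1}{2} \left[\, f(x^+) + f(x^-) \,\right]. \tag{3.125}$$

We begin by substituting the formulae (3.65) for the complex Fourier coefficients into the
formula (3.109) for the $n$th partial sum:

$$s_n(x) = \sum_{k=-n}^{n} c_k e^{ikx} = \sum_{k=-n}^{n} \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y)\, e^{-iky} \, dy \right) e^{ikx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) \sum_{k=-n}^{n} e^{ik(x-y)} \, dy. \tag{3.126}$$

To proceed further, we need to calculate the final summation

$$\sum_{k=-n}^{n} e^{ikx} = e^{-inx} + \cdots + e^{-ix} + 1 + e^{ix} + \cdots + e^{inx}.$$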
This, in fact, has the form of a geometric sum,

$$\sum_{k=0}^{m} a\, r^k = a + a r + a r^2 + \cdots + a r^m = a \left( \frac{r^{m+1} - 1}{r - 1} \right), \tag{3.127}$$

with $m + 1 = 2n + 1$ summands, initial term $a = e^{-inx}$, and ratio $r = e^{ix}$. Therefore,

$$\begin{aligned}
\sum_{k=-n}^{n} e^{ikx} &= e^{-inx}\, \frac{e^{i(2n+1)x} - 1}{e^{ix} - 1} = \frac{e^{i(n+1)x} - e^{-inx}}{e^{ix} - 1} \\
&= \frac{e^{i(n+\frac{1}{2})x} - e^{-i(n+\frac{1}{2})x}}{e^{ix/2} - e^{-ix/2}} = \frac{\sin\left(n + \frac{1}{2}\right) x}{\sin \frac{1}{2} x}.
\end{aligned} \tag{3.128}$$

In this computation, to pass from the first to the second line, we multiplied numerator and
denominator by $e^{-ix/2}$, after which we used the formula (3.60) for the sine function in terms
of complex exponentials. Incidentally, (3.128) is equivalent to the intriguing trigonometric
summation formula

$$1 + 2 \left( \cos x + \cos 2x + \cos 3x + \cdots + \cos nx \right) = \frac{\sin\left(n + \frac{1}{2}\right) x}{\sin \frac{1}{2} x}. \tag{3.129}$$

Therefore, substituting back into (3.126), we obtain

$$\begin{aligned}
s_n(x) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y)\, \frac{\sin\left(n + \frac{1}{2}\right)(x - y)}{\sin \frac{1}{2}(x - y)} \, dy \\
&= \frac{1}{2\pi} \int_{-\pi - x}^{\pi - x} f(x + y)\, \frac{\sin\left(n + \frac{1}{2}\right) y}{\sin \frac{1}{2} y} \, dy = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x + y)\, \frac{\sin\left(n + \frac{1}{2}\right) y}{\sin \frac{1}{2} y} \, dy.
\end{aligned}$$

The second equality is the result of changing the integration variable from $y$ to $x + y$ and
canceling the minus signs in the resulting trigonometric fraction, while the final equality
follows since the integrand is $2\pi$-periodic, and so its integrals over any interval of length
$2\pi$ all have the same value; see Exercise 3.2.9.
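The trigonometric summation formula (3.129) is easy to spot-check numerically. The following sketch compares the two sides at a few sample points; the function names `cosine_sum` and `dirichlet_ratio` and the chosen sample values are assumptions made for illustration, and the points avoid multiples of $2\pi$, where the right-hand side must be interpreted as a limit.

```python
import math

def cosine_sum(n, x):
    """Left-hand side of (3.129): 1 + 2 (cos x + cos 2x + ... + cos nx)."""
    return 1.0 + 2.0 * sum(math.cos(k * x) for k in range(1, n + 1))

def dirichlet_ratio(n, x):
    """Right-hand side of (3.129): sin((n + 1/2) x) / sin(x / 2)."""
    return math.sin((n + 0.5) * x) / math.sin(0.5 * x)

for n in (1, 5, 20):
    for x in (0.1, 1.0, 2.5, -3.0):
        print(n, x, cosine_sum(n, x), dirichlet_ratio(n, x))
```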

Thus, to prove (3.125), it suffices to show that

$$\begin{aligned}
\lim_{n \to \infty} \frac{1}{\pi} \int_{0}^{\pi} f(x + y)\, \frac{\sin\left(n + \frac{1}{2}\right) y}{\sin \frac{1}{2} y} \, dy &= f(x^+), \\
\lim_{n \to \infty} \frac{1}{\pi} \int_{-\pi}^{0} f(x + y)\, \frac{\sin\left(n + \frac{1}{2}\right) y}{\sin \frac{1}{2} y} \, dy &= f(x^-).
\end{aligned} \tag{3.130}$$

The proofs of the two formulas are identical, and so we concentrate on establishing the
first. Using the fact that the integrand is even, and then our summation formula (3.128)
in reverse, yields

$$\frac{1}{\pi} \int_{0}^{\pi} \frac{\sin\left(n + \frac{1}{2}\right) y}{\sin \frac{1}{2} y} \, dy = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{\sin\left(n + \frac{1}{2}\right) y}{\sin \frac{1}{2} y} \, dy = \frac{1}{2\pi} \int_{-\pi}^{\pi} \sum_{k=-n}^{n} e^{iky} \, dy = 1,$$

because only the constant term has a nonzero integral. Multiplying this formula by $f(x^+)$
and then subtracting the result from the first formula in (3.130) leads to

$$\lim_{n \to \infty} \frac{1}{\pi} \int_{0}^{\pi} \frac{f(x + y) - f(x^+)}{\sin \frac{1}{2} y}\, \sin\left(n + \tfrac{1}{2}\right) y \, dy = 0, \tag{3.131}$$

which we now proceed to prove.

We claim that, for each fixed value of $x$, the function

$$g(y) = \frac{f(x + y) - f(x^+)}{\sin \frac{1}{2} y}$$

is piecewise continuous for all $0 \le y \le \pi$. Owing to our hypotheses on $f(x)$, the only
problematic point is at $y = 0$, but then, by l'Hôpital's Rule (for one-sided limits),

$$\lim_{y \to 0^+} g(y) = \lim_{y \to 0^+} \frac{f(x + y) - f(x^+)}{\sin \frac{1}{2} y} = \lim_{y \to 0^+} \frac{f'(x + y)}{\frac{1}{2} \cos \frac{1}{2} y} = 2 f'(x^+).$$

Consequently, (3.131) will be established if we can show that

$$\lim_{n \to \infty} \frac{1}{\pi} \int_{0}^{\pi} g(y)\, \sin\left(n + \tfrac{1}{2}\right) y \, dy = 0 \tag{3.132}$$

whenever $g$ is piecewise continuous. Were it not for the extra $\frac{1}{2}$, this would immediately
follow from the simplified Riemann-Lebesgue Lemma 3.40. More honestly, we can invoke
the addition formula for $\sin\left(n + \frac{1}{2}\right) y$ to write

$$\frac{1}{\pi} \int_{0}^{\pi} g(y)\, \sin\left(n + \tfrac{1}{2}\right) y \, dy = \frac{1}{\pi} \int_{0}^{\pi} \left( g(y) \sin \tfrac{1}{2} y \right) \cos ny \, dy + \frac{1}{\pi} \int_{0}^{\pi} \left( g(y) \cos \tfrac{1}{2} y \right) \sin ny \, dy.$$

The first integral is the $n$th Fourier cosine coefficient for the piecewise continuous function
$g(y) \sin \frac{1}{2} y$, while the second integral is the $n$th Fourier sine coefficient for the piecewise
continuous function $g(y) \cos \frac{1}{2} y$. Lemma 3.40 implies that both of these converge to zero
as $n \to \infty$, and hence (3.132) holds. This completes the proof, thus establishing pointwise
convergence of the Fourier series. Q.E.D.


Remark: An alternative approach to the last part of the proof is to use the general
Riemann-Lebesgue Lemma, whose proof can be found in [37, 128].

Lemma 3.46. Suppose $g(x)$ is piecewise continuous on $[a, b]$. Then

$$0 = \lim_{\omega \to \infty} \int_a^b g(x)\, e^{i \omega x} \, dx = \lim_{\omega \to \infty} \int_a^b g(x) \cos \omega x \, dx + i \lim_{\omega \to \infty} \int_a^b g(x) \sin \omega x \, dx. \tag{3.133}$$

Intuitively, the Riemann-Lebesgue Lemma says that, as the frequency $\omega$ gets larger
and larger, the increasingly rapid oscillations of the integrand tend to cancel each other
out.
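The cancellation described by the Riemann-Lebesgue Lemma can be observed numerically. The sketch below is an illustration under stated assumptions: the piecewise continuous test function `g` (with a jump at $x = \frac{1}{2}$), the interval $[0, 1]$, the sample frequencies, and the midpoint-rule quadrature are all hypothetical choices, not taken from the text.

```python
import math

def g(x):
    """A piecewise continuous test function on [0, 1] with a jump at x = 1/2."""
    return 1.0 if x < 0.5 else x

def oscillatory_integral(omega, a=0.0, b=1.0, n=100000):
    """Midpoint-rule approximation of integral_a^b g(x) sin(omega x) dx."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += g(x) * math.sin(omega * x)
    return total * h

for omega in (1.0, 10.0, 100.0, 1000.0):
    # The magnitude of the integral shrinks as the frequency omega grows.
    print(omega, oscillatory_integral(omega))
```

The number of quadrature nodes is kept much larger than the largest frequency so that the oscillations are properly resolved.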


Remark :  While  the  Fourier  series  of  a  merely  continuous  function  need  not  converge 
pointwise  everywhere,  a  deep  theorem,  proved  by  the  Swedish  mathematician  Lennart 
Carleson  in  1966,  [28],  states  that  the  set  of  points  where  it  does  not  converge  has  measure 
zero,  and  hence  the  exceptional  points  form  a  very  small  subset. 
Exercises


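3.5.26. Which of the following sequences converge in norm to the zero function on $\mathbb{R}$?

(a) $v_n(x) = \dfrac{nx}{1 + n^2 x^2}$,
(b) $v_n(x) = \begin{cases} 1, & n < x < n + 1, \\ 0, & \text{otherwise}, \end{cases}$
(c) $v_n(x) = \begin{cases} 1, & n < x < n + 1/n, \\ 0, & \text{otherwise}, \end{cases}$
(d) $v_n(x) = \begin{cases} 1/n, & n < x < 2n, \\ 0, & \text{otherwise}, \end{cases}$
(e) $v_n(x) = \begin{cases} 1/\sqrt{n}, & n < x < 2n, \\ 0, & \text{otherwise}, \end{cases}$
(f) $v_n(x) = \begin{cases} n^2 x^2 - 1, & -1/n < x < 1/n, \\ 0, & \text{otherwise}. \end{cases}$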

3.5.27.  Discuss  pointwise  and  L2  convergence  of  the  following  sequences  on  the  interval  [0, 1]: 


(a)  1 


x‘ 


n 


2  5 


ip) 


n,  1/n2  <  x  <  1/n,  ,x  g- 
x,  otherwise, 


nx 


(d)  sin  nx. 


3.5.28.  Prove,  directly  from  the  definition,  the  convergence  in  norm  of  the  Fourier  series  (3.49) 
of  the  step  function. 

3.5.29. Let $f(x) \in L^2[a, b]$ be square integrable. Which constant function $g(x) = c$ best
approximates $f$ in the least squares sense?

3.5.30. Suppose the sequence $f_n(x)$ converges pointwise to a function $f^*(x)$ on an interval $[a, b]$,
and converges to $g^*(x)$ in the $L^2$ norm on $[a, b]$. Is $f^*(x) = g^*(x)$ at every $a \le x \le b$?


3.5.31.  Find  a  formula  for  the  L2  norm  of  the  Fourier  series  in  Exercises  3.5.20  and  3.5.22. 

3.5.32.  Under  what  conditions  on  the  function  f(x)  is  the  least  squares  error  due  to  the  nth 
order  Fourier  partial  sum  equal  to  zero:  ||  /  —  sn  ||  =0? 

♦ 3.5.33. Let $V$ be an inner product space. Prove that the elements of a (finite or infinite)
orthonormal system $\varphi_1, \varphi_2, \ldots \in V$ are linearly independent, meaning that any finite linear
combination vanishes, $c_1 \varphi_1 + \cdots + c_n \varphi_n = 0$, if and only if the coefficients are all zero:
$c_1 = \cdots = c_n = 0$.

♦ 3.5.34. Let $V$ be a complex inner product space. Prove that, for all $f, g \in V$,
(a) $\|f + g\|^2 = \|f\|^2 + 2 \operatorname{Re} \langle f, g \rangle + \|g\|^2$;
(b) $\langle f, g \rangle = \frac{1}{4} \left( \|f + g\|^2 - \|f - g\|^2 + i\, \|f + i g\|^2 - i\, \|f - i g\|^2 \right)$.

♦ 3.5.35. Let $V$ be an infinite-dimensional complex inner product space, and $\varphi_k \in V$ a complete
orthonormal system. Prove the corresponding Plancherel and Parseval formulas.
Hint: Use the identities in Exercise 3.5.34.


3.5.36.  What  does  Plancherel’s  formula  (3.121)  tell  us  in  a  finite-dimensional  vector  space? 
What  about  Parseval’s  formula  (3.122)? 

3.5.37. Let $f(x) = x$, $g(x) = \operatorname{sign} x$. (a) Write out Plancherel's formula for the complex
Fourier coefficients of $f$. (b) Write out Plancherel's formula for the complex Fourier
coefficients of $g$. (c) Write out Parseval's formula for the complex Fourier coefficients of $f$, $g$.

♦ 3.5.38. (a) Prove the real version of the Plancherel formula

$$\frac{1}{\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx = \tfrac{1}{2} a_0^2 + \sum_{k=1}^{\infty} \left( a_k^2 + b_k^2 \right) \tag{3.134}$$

for the trigonometric Fourier coefficients of a real function $f(x)$.
(b) What is the real version of Parseval's formula?

3.5.39. Give an alternative proof of formula (3.129) that does not require complex functions by
first multiplying through by $\sin \frac{1}{2} x$ and then invoking a suitable trigonometric identity for
the product terms.
3.5.40. (a) Prove that the functions $\varphi_n(x) = \sin\left(n - \frac{1}{2}\right) x$, for $n = 1, 2, 3, \ldots$, form an orthogonal
sequence on the interval $[0, \pi]$ relative to the $L^2$ inner product $\langle f, g \rangle = \int_0^{\pi} f(x)\, g(x) \, dx$.
(b) Find the formula for the Fourier coefficients of a function $f(x)$ relative to the orthogonal
sequence $\varphi_n(x)$. (c) State Bessel's inequality and Plancherel's formula in this case.
Carefully state any hypotheses that might be required for the validity of your formulas.

♦ 3.5.41. Prove that a sequence of vectors $v^{(n)} \in \mathbb{R}^m$ converges in the Euclidean norm,
$\|v^{(n)} - v^*\| \to 0$ as $n \to \infty$, if and only if their individual components converge:
$v_i^{(n)} \to v_i^*$ for $i = 1, \ldots, m$.

♦ 3.5.42. Let $V$ be a normed vector space. A sequence $v_n \in V$ is called a Cauchy sequence if for
every $\varepsilon > 0$ there exists an $N$ such that $\|v_m - v_n\| < \varepsilon$ whenever both $m, n \ge N$. Prove
that a sequence that converges in norm, $\|v_n - v^*\| \to 0$ as $n \to \infty$, is necessarily a Cauchy
sequence. Remark: A normed vector space is called complete if every Cauchy sequence
converges in norm. It can be proved, [96, 98], that any finite-dimensional normed vector
space is complete, but this is not necessarily the case in infinite dimensions. For example,
the vector spaces consisting of all trigonometric polynomials and of all polynomials are not
complete in the $L^2$ norm. The most important example of a complete infinite-dimensional
vector space is the Hilbert space $L^2$.


♦ 3.5.43. For each $n = 1, 2, \ldots$, define the function
$f_n(x) = \begin{cases} 1, & \dfrac{k}{m} \le x \le \dfrac{k+1}{m}, \\ 0, & \text{otherwise}, \end{cases}$
where $n = \frac{1}{2} m (m + 1) + k$ and $0 \le k \le m$. Show first that $m$, $k$ are uniquely determined by $n$.
Then prove that, on the interval $[0, 1]$, the sequence $f_n(x)$ converges in norm to 0 but does
not converge pointwise anywhere!


♥ 3.5.44. Let $u(t, x)$ solve the initial value problem
$\dfrac{\partial^2 u}{\partial t^2} = c^2 \dfrac{\partial^2 u}{\partial x^2}$, $u(0, x) = f(x)$, $\dfrac{\partial u}{\partial t}(0, x) = 0$,
for $-\infty < x < \infty$, where $f(x) \to 0$ as $|x| \to \infty$. True or false: As $t \to \infty$, the solution
$u(t, x)$ converges to an equilibrium solution (a) pointwise; (b) uniformly; (c) in norm.

♥ 3.5.45. Answer Exercise 3.5.44 for the initial conditions $u(0, x) = 0$, $\dfrac{\partial u}{\partial t}(0, x) = g(x)$, with
$g(x) \to 0$ as $|x| \to \infty$.

Chapter  4 

Separation  of  Variables 


Three  cardinal  linear  second-order  partial  differential  equations  have  collectively  driven  the 
development  of  the  entire  subject.  The  first  two  we  have  already  encountered:  The  wave 
equation  describes  vibrations  and  waves  in  continuous  media,  including  sound  waves,  water 
waves,  elastic  waves,  electromagnetic  waves,  and  so  on.  The  heat  equation  models  diffusion 
processes,  including  thermal  energy  in  solids,  solutes  in  liquids,  and  biological  populations. 
Third, and in many ways the most important of all, is the Laplace equation and its
inhomogeneous counterpart, the Poisson equation, which govern equilibrium mechanics. The
latter two equations arise in an astonishing variety of mathematical and physical contexts,
ranging through elasticity and solid mechanics, fluid mechanics, electromagnetism, potential
theory, thermomechanics, geometry, probability, number theory, and many other fields.
The solutions to the Laplace equation are known as harmonic functions, and the discovery
of their many remarkable properties forms one of the most celebrated chapters in the
history of mathematics. All three equations, along with their multi-dimensional kin, will
appear repeatedly throughout this text.

The aim of the current chapter is to develop the method of separation of variables
for solving these key partial differential equations in their two-independent-variable
incarnations. For the wave and heat equations, the variables are time, t, and a single space
coordinate, x, leading to initial-boundary value problems modeling the dynamical behavior
of a one-dimensional medium. For the Laplace and Poisson equations, both variables
represent space coordinates, x and y, and the associated boundary value problems model
the equilibrium configuration of a planar body, e.g., the deformations of a membrane.
Separation of variables seeks special solutions that can be written as the product of functions
of the individual variables, thereby reducing the partial differential equation to a pair of
ordinary differential equations. More general solutions can then be expressed as infinite
series in the appropriate separable solutions. For the two-variable equations considered
here, this results in a Fourier series representation of the solution. In the case of the wave
equation, separation of variables serves to focus attention on the vibrational character of
the solution, whereas the earlier d’Alembert approach emphasizes its particle-like aspects.
Unfortunately, for the Laplace equation, separation of variables applies only to boundary
value problems in very special geometries, e.g., rectangles and disks. Further development
of the separation of variables method for solving partial differential equations in three or
more variables can be found in Chapters 11 and 12.

In the final section, we take the opportunity to summarize the fundamental tripartite
classification of planar second-order partial differential equations. Each of the three
paradigmatic equations epitomizes one of the classes: hyperbolic, such as the wave
equation; parabolic, such as the heat equation; and elliptic, such as the Laplace and Poisson
equations. Each category enjoys its own distinctive properties and features, both analytic
and numeric, and, in effect, forms a separate mathematical subdiscipline.


4.1  The  Diffusion  and  Heat  Equations 


Let  us  begin  with  a  brief  physical  derivation  of  the  heat  equation  from  first  principles. 
We  consider  a  bar  —  meaning  a  thin,  heat-conducting  body.  “Thin”  means  that  we  can 
regard  the  bar  as  a  one-dimensional  continuum  with  no  significant  transverse  temperature 
variation.  We  will  assume  that  the  bar  is  fully  insulated  along  its  length,  and  so  heat  can 
enter  (or  leave)  only  through  its  uninsulated  endpoints.  We  use  t  to  represent  time,  and 
a  <  x  <  b  to  denote  spatial  position  along  the  bar,  which  occupies  the  interval  [a,  b].  Our 
goal  is  to  find  the  temperature  u(t,  x)  of  the  bar  at  position  x  and  time  t. 

The  dynamical  equations  governing  the  temperature  are  based  on  three  fundamental 
physical  principles.  First  is  the  Law  of  Conservation  of  Heat  Energy.  Recalling  the  general 
Definition  2.7,  this  particular  conservation  law  takes  the  form 


\[ \frac{\partial e}{\partial t} + \frac{\partial w}{\partial x} = 0, \tag{4.1} \]

in which $e(t, x)$ represents the thermal energy density at time $t$ and position $x$, while
$w(t, x)$ denotes the heat flux, i.e., the rate of flow of thermal energy along the bar. Our
sign convention is that $w(t, x) > 0$ at points where the energy flows in the direction of
increasing $x$ (left to right). The integrated form (2.49) of the conservation law, namely

\[ \frac{d}{dt} \int_a^b e(t, x)\, dx = w(t, a) - w(t, b), \tag{4.2} \]


states  that  the  rate  of  change  in  the  thermal  energy  within  the  bar  is  equal  to  the  total 
heat  flux  passing  through  its  uninsulated  ends.  The  signs  of  the  boundary  terms  confirm 
that  heat  flux  into  the  bar  results  in  an  increase  in  temperature. 

The second ingredient is a constitutive assumption concerning the bar’s material
properties. It has been observed that, under reasonable conditions, thermal energy is
proportional to temperature:

\[ e(t, x) = \sigma(x)\, u(t, x). \tag{4.3} \]

The factor

\[ \sigma(x) = \rho(x)\, \chi(x) > 0 \tag{4.4} \]


is the product of the density $\rho$ of the material and its specific heat capacity $\chi$, which is
the amount of heat energy required to raise the temperature of a unit mass of the material
by one degree. Note that we are assuming that the medium is not changing in time, and
so physical quantities such as density and specific heat depend only on position $x$. We
also assume, perhaps with less physical justification, that its material properties do not
depend upon the temperature; otherwise, we would be forced to deal with a much thornier
nonlinear diffusion equation [70, 99].

The third physical principle relates heat flux and temperature. Physical experiments
show that the thermal energy moves from hot to cold at a rate that is in direct proportion to
the temperature gradient, which, in the one-dimensional case, means its derivative $\partial u/\partial x$.
The resulting relation

\[ w(t, x) = -\kappa(x)\, \frac{\partial u}{\partial x} \tag{4.5} \]

is known as Fourier’s Law of Cooling. The proportionality factor $\kappa(x) > 0$ is the thermal
conductivity of the bar at position $x$, and the minus sign reflects the everyday observation
that heat energy moves from hot to cold. A good heat conductor, e.g., silver, will have
high conductivity, while a poor conductor, e.g., glass, will have low conductivity.

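Combining the three laws (4.1), (4.3), and (4.5) produces the linear diffusion equation

\[ \frac{\partial}{\partial t} \bigl( \sigma(x)\, u \bigr) = \frac{\partial}{\partial x} \Bigl( \kappa(x)\, \frac{\partial u}{\partial x} \Bigr), \qquad a < x < b, \tag{4.6} \]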
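The substitution behind this combination can be written out in one line. The following is only a sketch of the intermediate step, using the symbols of (4.1), (4.3), and (4.5) and assuming enough smoothness to differentiate freely:

```latex
\begin{align*}
0 &= \frac{\partial e}{\partial t} + \frac{\partial w}{\partial x}
   && \text{conservation law (4.1)} \\
  &= \frac{\partial}{\partial t}\bigl(\sigma(x)\,u\bigr)
     + \frac{\partial}{\partial x}\Bigl(-\kappa(x)\,\frac{\partial u}{\partial x}\Bigr)
   && \text{by (4.3) and (4.5)}.
\end{align*}
```

Moving the flux term to the right-hand side yields the linear diffusion equation. Since $\sigma$ does not depend on $t$, the left-hand side can equally be written as $\sigma(x)\,\partial u/\partial t$.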


governing the thermodynamics of a one-dimensional medium. It is also used to model a
wide variety of diffusive processes, including chemical diffusion, diffusion of contaminants
in liquids and gases, population dispersion, and the spread of infectious diseases. If there
is an external heat source along the length of the bar, then the diffusion equation acquires
an additional prescribed inhomogeneous term:

\[ \frac{\partial}{\partial t} \bigl( \sigma(x)\, u \bigr) = \frac{\partial}{\partial x} \Bigl( \kappa(x)\, \frac{\partial u}{\partial x} \Bigr) + h(t, x), \qquad a < x < b. \tag{4.7} \]


In order to uniquely prescribe the solution $u(t, x)$, we need to specify an initial
temperature distribution

\[ u(t_0, x) = f(x), \qquad a < x < b. \tag{4.8} \]

In addition, we must impose a suitable boundary condition at each end of the bar. There
are three common types. The first is a Dirichlet boundary condition, where the end is held
at a prescribed temperature. For example,

\[ u(t, a) = \alpha(t) \tag{4.9} \]

fixes the temperature (possibly time-varying) at the left end. Alternatively, the Neumann
boundary condition

\[ \frac{\partial u}{\partial x}(t, a) = \mu(t) \tag{4.10} \]

prescribes the heat flux $w(t, a) = -\kappa(a)\, u_x(t, a)$ there. In particular, a homogeneous
Neumann condition, $u_x(t, a) = 0$, models an insulated end that prevents thermal energy flowing
in or out. The Robin† boundary condition,

\[ \frac{\partial u}{\partial x}(t, a) + \beta(t)\, u(t, a) = r(t), \tag{4.11} \]

models the heat exchange resulting from the end of the bar being placed in a heat bath
(thermal reservoir) at temperature $r(t)$.

Each end of the bar is required to satisfy one of these boundary conditions. For
example, a bar with both ends having prescribed temperatures is governed by the pair of
Dirichlet boundary conditions

\[ u(t, a) = \alpha(t), \qquad u(t, b) = \beta(t), \tag{4.12} \]


† Since it is named after the nineteenth-century French analyst Victor Gustave Robin, the
pronunciation should be with a French accent.




whereas a bar with two insulated ends requires two homogeneous Neumann boundary
conditions

\[ \frac{\partial u}{\partial x}(t, a) = 0, \qquad \frac{\partial u}{\partial x}(t, b) = 0. \tag{4.13} \]


Mixed boundary conditions, with one end at a fixed temperature and the other insulated,
are similarly formulated, e.g.,

\[ u(t, a) = \alpha(t), \qquad \frac{\partial u}{\partial x}(t, b) = 0. \tag{4.14} \]

Finally, the periodic boundary conditions

\[ u(t, a) = u(t, b), \qquad \frac{\partial u}{\partial x}(t, a) = \frac{\partial u}{\partial x}(t, b), \tag{4.15} \]

correspond  to  a  circular  ring  obtained  by  joining  the  two  ends  of  the  bar.  As  before,  we 
are  assuming  that  the  heat  is  allowed  to  flow  only  around  the  ring  —  insulation  prevents 
the  radiation  of  heat  from  one  side  of  the  ring  affecting  the  other  side. 


The  Heat  Equation 


In this book, we will retain the term “heat equation” to refer to the case in which the
bar is composed of a uniform material, and so its density $\rho$, conductivity $\kappa$, and specific
heat $\chi$ are all positive constants. We also exclude external heat sources (other than at the
endpoints), meaning that the bar remains insulated along its entire length. Under these
assumptions, the general diffusion equation (4.6) reduces to the homogeneous heat equation

\[ \frac{\partial u}{\partial t} = \gamma\, \frac{\partial^2 u}{\partial x^2} \tag{4.16} \]

for the temperature $u(t, x)$ at time $t$ and position $x$. The constant

\[ \gamma = \frac{\kappa}{\sigma} = \frac{\kappa}{\rho\, \chi} \tag{4.17} \]

is called the thermal diffusivity; it incorporates all of the bar’s relevant physical properties.
The solution $u(t, x)$ will be uniquely prescribed once we specify initial conditions (4.8) and
a suitable boundary condition at both of its endpoints.

As we learned in Section 3.1, the separable solutions to the heat equation are based
on the exponential ansatz†

\[ u(t, x) = e^{-\lambda t}\, v(x), \tag{4.18} \]

where $v(x)$ depends only on the spatial variable. Functions of this form, which “separate”
into a product of a function of $t$ times a function of $x$, are known as separable solutions.
Substituting (4.18) into (4.16) and canceling the common exponential factors, we find that
$v(x)$ must solve the second-order linear ordinary differential equation

\[ -\gamma\, \frac{d^2 v}{dx^2} = \lambda\, v. \]


Each nontrivial solution $v(x) \not\equiv 0$ is an eigenfunction, with associated eigenvalue $\lambda$, for the
linear differential operator $L[v] = -\gamma\, v''(x)$. With the separable eigensolutions (4.18) in
hand, we will then be able to reconstruct the desired solution $u(t, x)$ as a linear combination,
or rather infinite series, thereof.

† Anticipating the eventual signs of the eigenvalues, and to facilitate later discussions, we now
include a minus sign in the exponential term.

Let us concentrate on the simplest case: a uniform, insulated bar of length $\ell$ that is
held at zero temperature at both ends. We specify its initial temperature $f(x)$ at time
$t_0 = 0$, and so the relevant initial and boundary conditions are

\[ u(t, 0) = 0, \quad u(t, \ell) = 0, \quad t > 0, \qquad u(0, x) = f(x), \quad 0 < x < \ell. \tag{4.19} \]

The eigensolutions (4.18) are found by solving the Dirichlet boundary value problem

\[ \gamma\, \frac{d^2 v}{dx^2} + \lambda\, v = 0, \qquad v(0) = 0, \qquad v(\ell) = 0. \tag{4.20} \]

By direct calculation (as you are asked to do in Exercises 4.1.19-20), one finds that if $\lambda$
is either complex, or real and nonpositive, then the only solution to the boundary value
problem (4.20) is the trivial solution $v(x) \equiv 0$. This means that all the eigenvalues must
necessarily be real and positive. In fact, the reality and positivity of the eigenvalues need
not be explicitly checked. Rather, they follow from very general properties of positive
definite boundary value problems, of which (4.20) is a particular case. See Section 9.5 for
the underlying theory and Theorem 9.34 for the relevant result.

When $\lambda > 0$, the general solution to the differential equation is a trigonometric
function

\[ v(x) = a \cos \omega x + b \sin \omega x, \qquad \text{where} \qquad \omega = \sqrt{\lambda/\gamma}, \]

and $a$ and $b$ are arbitrary constants. The first boundary condition requires $v(0) = a = 0$.
This serves to eliminate the cosine term, and then the second boundary condition requires

\[ v(\ell) = b \sin \omega \ell = 0. \]

Therefore, since we require $b \ne 0$ (otherwise, the solution is trivial and does not qualify
as an eigenfunction), $\omega \ell$ must be an integer multiple of $\pi$, and so

\[ \omega = \frac{\pi}{\ell}, \ \frac{2\pi}{\ell}, \ \frac{3\pi}{\ell}, \ \ldots\,. \]

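We conclude that the eigenvalues and eigenfunctions of the boundary value problem (4.20)
are

\[ \lambda_n = \gamma \Bigl( \frac{n \pi}{\ell} \Bigr)^{2}, \qquad v_n(x) = \sin \frac{n \pi x}{\ell}, \qquad n = 1, 2, 3, \ldots\,. \tag{4.21} \]

The corresponding eigensolutions (4.18) are

\[ u_n(t, x) = \exp\Bigl( -\frac{\gamma\, n^2 \pi^2\, t}{\ell^2} \Bigr) \sin \frac{n \pi x}{\ell}, \qquad n = 1, 2, 3, \ldots\,. \tag{4.22} \]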


Each  represents  a  trigonometrically  oscillating  temperature  profile  that  maintains  its  form 
while  decaying  to  zero  at  an  exponentially  fast  rate. 
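As a quick numerical sanity check of the eigensolutions, the sketch below evaluates $u_n(t, x) = \exp(-\gamma n^2 \pi^2 t/\ell^2) \sin(n \pi x/\ell)$ and estimates the residual $u_t - \gamma u_{xx}$ by central differences. The function names, the step size `h`, and the sample point are illustrative choices, not part of the text; the residual is only expected to vanish up to discretization error.

```python
import math

def eigensolution(n, t, x, gamma=1.0, ell=1.0):
    """n-th separable solution exp(-lambda_n t) sin(n pi x / ell) of the Dirichlet problem."""
    lam = gamma * (n * math.pi / ell) ** 2   # eigenvalue lambda_n = gamma (n pi / ell)^2
    return math.exp(-lam * t) * math.sin(n * math.pi * x / ell)

def heat_residual(n, t, x, gamma=1.0, ell=1.0, h=1e-4):
    """Central-difference estimate of u_t - gamma u_xx at the point (t, x)."""
    def u(tt, xx):
        return eigensolution(n, tt, xx, gamma, ell)
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / h ** 2
    return u_t - gamma * u_xx

for n in (1, 2, 3):
    # Residual should be tiny; boundary values should vanish up to rounding.
    print(n, heat_residual(n, t=0.05, x=0.3),
          eigensolution(n, 0.05, 0.0), eigensolution(n, 0.05, 1.0))
```

The residual is small but not exactly zero because the derivatives are approximated; shrinking `h` too far eventually makes rounding error dominate.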

To solve the general initial value problem, we assemble the eigensolutions into an
infinite series,

\[ u(t, x) = \sum_{n=1}^{\infty} b_n\, u_n(t, x) = \sum_{n=1}^{\infty} b_n \exp\Bigl( -\frac{\gamma\, n^2 \pi^2\, t}{\ell^2} \Bigr) \sin \frac{n \pi x}{\ell}, \tag{4.23} \]

whose coefficients $b_n$ are to be fixed by the initial conditions. Indeed, assuming that the
series converges, the initial temperature profile is

\[ u(0, x) = \sum_{n=1}^{\infty} b_n \sin \frac{n \pi x}{\ell} = f(x). \tag{4.24} \]

This has the form of a Fourier sine series (3.52) on the interval $[0, \ell]$. Thus, the coefficients
are determined by the Fourier formulae (3.53), and so

\[ b_n = \frac{2}{\ell} \int_0^{\ell} f(x) \sin \frac{n \pi x}{\ell}\, dx, \qquad n = 1, 2, 3, \ldots\,. \tag{4.25} \]

The resulting formula (4.23) describes the Fourier sine series for the temperature $u(t, x)$ of
the bar at each later time $t > 0$.

Example 4.1. Consider the initial temperature profile

\[ u(0, x) = f(x) = \begin{cases} -x, & 0 \le x \le \tfrac15, \\ x - \tfrac25, & \tfrac15 \le x \le \tfrac7{10}, \\ 1 - x, & \tfrac7{10} \le x \le 1, \end{cases} \tag{4.26} \]

on a bar of length 1, plotted in the first graph in Figure 4.1. Using (4.25), the first few
Fourier coefficients of $f(x)$ are computed (by either exact or numerical integration) to be

\[ b_1 \approx .0897, \quad b_2 \approx -.1927, \quad b_3 \approx -.0289, \quad b_4 = 0, \]
\[ b_5 \approx -.0162, \quad b_6 \approx .0132, \quad b_7 \approx .0104, \quad b_8 = 0, \quad \ldots\,. \]

The resulting Fourier series solution to the heat equation is

\[ u(t, x) = \sum_{n=1}^{\infty} b_n\, u_n(t, x) = \sum_{n=1}^{\infty} b_n\, e^{-\gamma n^2 \pi^2 t} \sin n \pi x \]
\[ \approx .0897\, e^{-\gamma \pi^2 t} \sin \pi x - .1927\, e^{-4 \gamma \pi^2 t} \sin 2 \pi x - .0289\, e^{-9 \gamma \pi^2 t} \sin 3 \pi x - \cdots. \]

In Figure 4.1, the solution, for $\gamma = 1$, is plotted at some representative times. Observe
that the corners in the initial profile are immediately smoothed out. As time progresses,
the solution decays, at a fast exponential rate of $e^{-\pi^2 t} \approx e^{-9.87\, t}$, to a uniform, zero
temperature, which is the equilibrium temperature distribution for the homogeneous Dirichlet
boundary conditions. As the solution decays to thermal equilibrium, the higher Fourier
modes rapidly disappear, and the solution assumes the progressively more symmetric shape
of a single sine arc, of rapidly decreasing amplitude.
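The coefficients quoted in Example 4.1 can be reproduced numerically. The following sketch assumes the piecewise linear profile with breakpoints at $x = 1/5$ and $x = 7/10$, approximates the sine coefficient integral with a composite midpoint rule, and evaluates a partial sum of the series with $\ell = 1$. The function names, the number of quadrature samples, and the truncation at 40 modes are illustrative assumptions.

```python
import math

def f(x):
    """Piecewise linear initial temperature profile of Example 4.1 on [0, 1]."""
    if x <= 0.2:
        return -x
    if x <= 0.7:
        return x - 0.4
    return 1.0 - x

def sine_coefficient(n, samples=2000):
    """b_n = 2 * integral_0^1 f(x) sin(n pi x) dx, by the composite midpoint rule."""
    h = 1.0 / samples
    total = 0.0
    for j in range(samples):
        x = (j + 0.5) * h                      # midpoint of the j-th subinterval
        total += f(x) * math.sin(n * math.pi * x)
    return 2.0 * h * total

MODES = 40                                      # truncation level (an arbitrary choice)
coefficients = [sine_coefficient(n) for n in range(1, MODES + 1)]

def temperature(t, x, gamma=1.0):
    """Partial sum of the sine series solution for a bar of length 1."""
    return sum(b * math.exp(-gamma * (n * math.pi) ** 2 * t) * math.sin(n * math.pi * x)
               for n, b in enumerate(coefficients, start=1))

print([round(b, 4) for b in coefficients[:8]])
print(temperature(0.0, 0.5), temperature(0.01, 0.5), temperature(1.0, 0.5))
```

At $t = 0$ the partial sum only approximates $f$, since the series is truncated; for $t > 0$ the neglected terms are exponentially small.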


Smoothing  and  Long-Time  Behavior 

The  fact  that  we  can  write  the  solution  to  an  initial-boundary  value  problem  in  the  form 
of  an  infinite  series  (4.23)  is  progress  of  a  sort.  However,  because  we  are  unable  to  sum  the 
series  in  closed  form,  this  “solution”  is  much  less  satisfying  than  a  direct,  explicit  formula. 
Nevertheless,  there  are  important  qualitative  and  quantitative  features  of  the  solution  that 
can  be  easily  gleaned  from  such  series  expansions. 
Figure 4.1. A solution to the heat equation. (Panel titles recovered: t = 0, t = .001, t = .01.)


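If the initial data $f(x)$ is integrable (e.g., piecewise continuous), then its Fourier
coefficients are uniformly bounded; indeed, for any $n \ge 1$,

\[ | b_n | \le \frac{2}{\ell} \int_0^{\ell} \Bigl| f(x) \sin \frac{n \pi x}{\ell} \Bigr|\, dx \le \frac{2}{\ell} \int_0^{\ell} | f(x) |\, dx \equiv M. \tag{4.27} \]

This property holds even for quite irregular data. Under these conditions, each term in the
series solution (4.23) is bounded by an exponentially decaying function

\[ \Bigl| b_n \exp\Bigl( -\frac{\gamma\, n^2 \pi^2}{\ell^2}\, t \Bigr) \sin \frac{n \pi x}{\ell} \Bigr| \le M \exp\Bigl( -\frac{\gamma\, n^2 \pi^2}{\ell^2}\, t \Bigr). \]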


This means that, as soon as $t > 0$, most of the high-frequency terms, $n \gg 0$, will be
extremely small. Only the first few terms will be at all noticeable, and so the solution
essentially degenerates into a finite sum over the first few Fourier modes. As time increases,
more and more of the Fourier modes will become negligible, and the sum further degenerates
into fewer and fewer significant terms. Eventually, as $t \to \infty$, all of the Fourier modes will
decay to zero. Therefore, the solution will converge exponentially fast to a zero temperature
profile: $u(t, x) \to 0$ as $t \to \infty$, representing the bar in its final uniform thermal equilibrium.
The fact that its equilibrium temperature is zero is the result of holding both ends of the
bar fixed at zero temperature, whereby any initial thermal energy is eventually dissipated
away through the ends. The small-scale temperature fluctuations tend to rapidly cancel
out through diffusion of thermal energy, and the last term to disappear is the one with the
slowest decay, namely

\[ u(t, x) \approx b_1 \exp\Bigl( -\frac{\gamma\, \pi^2}{\ell^2}\, t \Bigr) \sin \frac{\pi x}{\ell}, \qquad \text{where} \qquad b_1 = \frac{2}{\ell} \int_0^{\ell} f(x) \sin \frac{\pi x}{\ell}\, dx. \tag{4.28} \]

For generic initial data, the coefficient $b_1 \ne 0$, and the solution approaches thermal
equilibrium at an exponential rate prescribed by the smallest eigenvalue, $\lambda_1 = \gamma \pi^2/\ell^2$, which is
proportional to the thermal diffusivity divided by the square of the length of the bar. The
longer the bar, or the smaller the diffusivity, the longer it takes for the effect of holding the
ends at zero temperature to propagate along its entire length. Also, again provided $b_1 \ne 0$,
the asymptotic shape of the temperature profile is a small, exponentially decaying sine arc,
just as we observed in Example 4.1. In exceptional situations, namely when $b_1 = 0$, the
solution decays even faster, at a rate equal to the eigenvalue $\lambda_k = \gamma k^2 \pi^2/\ell^2$ corresponding
to the first nonzero term, $b_k \ne 0$, in the Fourier series; its asymptotic shape now oscillates
$k$ times over the interval.

Figure 4.2. Denoising a signal with the heat equation.
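To make the phrase "only the first few terms will be at all noticeable" concrete, the sketch below counts how many modes have a damping factor $\exp(-\gamma n^2 \pi^2 t/\ell^2)$ above a chosen threshold. The threshold `tol` and the function names are assumptions made for illustration; which modes count as "noticeable" in a given problem also depends on the size of the coefficients $b_n$.

```python
import math

def damping(n, t, gamma=1.0, ell=1.0):
    """Factor by which the n-th sine mode has decayed at time t."""
    return math.exp(-gamma * n ** 2 * math.pi ** 2 * t / ell ** 2)

def significant_modes(t, tol=1e-6, gamma=1.0, ell=1.0):
    """Largest n whose damping factor is still at least tol (0 if there is none)."""
    if t <= 0.0:
        raise ValueError("t must be positive; at t = 0 no mode has been damped")
    # Solve exp(-gamma n^2 pi^2 t / ell^2) >= tol for n.
    n_max = math.sqrt(-math.log(tol) * ell ** 2 / (gamma * math.pi ** 2 * t))
    return math.floor(n_max)

for t in (0.001, 0.01, 0.1, 1.0):
    print(t, significant_modes(t), damping(1, t))
```

With $\gamma = \ell = 1$ and this threshold, the count shrinks from a few dozen modes at $t = .001$ to the single mode $n = 1$ by $t = 1$, in line with the single sine arc seen at late times.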

Another, closely related, observation is that, for any fixed time $t > 0$ after the initial
moment, the coefficients in the Fourier sine series (4.23) decay exponentially fast as $n \to \infty$.
According to Corollary 3.32, this implies that the Fourier series converges to an infinitely
differentiable function of $x$ at each positive time $t$, no matter how unsmooth the initial
temperature profile. We have discovered the basic smoothing property of heat flow, which
we state for a general initial time $t_0$.

Theorem 4.2. If $u(t, x)$ is a solution to the heat equation with piecewise continuous
initial data $f(x) = u(t_0, x)$, or, more generally, initial data satisfying (4.27), then, for any
$t > t_0$, the solution $u(t, x)$ is an infinitely differentiable function of $x$.

In other words, the heat equation instantaneously smoothes out any discontinuities
and corners in the initial temperature profile by fast damping of the high-frequency modes.
The heat equation’s effect on irregular initial data underlies its effectiveness for smoothing
and denoising signals. We take the initial data $u(0, x) = f(x)$ to be a noisy signal, and
then evolve the heat equation forward to a prescribed time $t_\star > 0$. The resulting function
$g(x) = u(t_\star, x)$ will be a smoothed version of the original signal $f(x)$ in which most of
the high-frequency noise has been eliminated. Of course, if we run the heat flow for too
long, all of the low-frequency features will also be smoothed out and the result will be
a uniform, constant signal. Thus, the choice of stopping time $t_\star$ is crucial to the success
of this method. Figure 4.2 shows the effect of running the heat equation,† with $\gamma = 1$,
on a signal that has been contaminated by random noise. Observe how quickly the noise
is removed. By the final time, the overall smoothing effect of the heat flow has caused
significant degradation (blurring) of the original signal. The heat equation approach to
denoising has the advantage that no Fourier coefficients need be explicitly computed, nor
does one need to reconstruct the smoothed signal. Basic numerical solution schemes for
the heat equation are to be discussed in Chapter 5.
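The denoising procedure can be illustrated without computing any Fourier coefficients. The sketch below is not the scheme used to produce Figure 4.2, whose details are not given here; it assumes a simple explicit finite difference step on a periodic grid, with the ratio $\mu = \gamma\, \Delta t/\Delta x^2$ kept at or below $1/2$, the usual stability restriction for this scheme. The test signal, noise level, grid size, and stopping time are all illustrative choices.

```python
import math
import random

def heat_step(u, mu):
    """One explicit step u_j <- u_j + mu (u_{j+1} - 2 u_j + u_{j-1}) on a periodic grid."""
    m = len(u)
    # u[j - 1] wraps around for j = 0 through Python's negative indexing.
    return [u[j] + mu * (u[(j + 1) % m] - 2.0 * u[j] + u[j - 1]) for j in range(m)]

def denoise(signal, t_star, gamma=1.0, length=1.0, mu=0.4):
    """Run the discrete heat flow up to approximately time t_star."""
    if not 0.0 < mu <= 0.5:
        raise ValueError("the explicit scheme needs 0 < mu <= 1/2")
    dx = length / len(signal)
    dt = mu * dx * dx / gamma
    steps = max(1, round(t_star / dt))   # stopping time rounded to whole steps
    u = list(signal)
    for _ in range(steps):
        u = heat_step(u, mu)
    return u

def roughness(u):
    """Sum of squared neighbor differences: a crude measure of high-frequency content."""
    m = len(u)
    return sum((u[(j + 1) % m] - u[j]) ** 2 for j in range(m))

def rms_error(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))

random.seed(0)
m = 128
clean = [math.sin(2.0 * math.pi * j / m) for j in range(m)]
noisy = [c + 0.2 * random.gauss(0.0, 1.0) for c in clean]
smooth = denoise(noisy, t_star=1e-3)

print(roughness(noisy), roughness(smooth))
print(rms_error(noisy, clean), rms_error(smooth, clean))
```

Increasing `t_star` removes more noise but also flattens the underlying sine wave, which is the trade-off in choosing the stopping time.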

An important theoretical consequence of the smoothing property is that diffusion is a
one-way process: one cannot run time backwards and accurately infer what a temperature
distribution looked like in the past. In particular, if the initial data $u(0, x) = f(x)$ is not
smooth, then the value of $u(t, x)$ for any $t < 0$ cannot be defined, because if $u(t_0, x)$ were
defined and integrable at some $t_0 < 0$ then, by Theorem 4.2, $u(t, x)$ would be smooth at all
subsequent times $t > t_0$, including $t = 0$, in contradiction to our assumption. Moreover, for
most initial data, the Fourier coefficients in the solution formula (4.23) are, at any $t < 0$,
exponentially growing as $n \to \infty$, indicating that high-frequency noise has completely
overwhelmed the solution, thereby precluding any kind of convergence of the Fourier series.

Mathematically, we can reverse future and past by changing $t$ to $-t$. In the differential
equation, this merely reverses the sign of the time-derivative term; the $x$ derivatives are
unaffected. Thus, by the above reasoning, the backwards heat equation

\[ \frac{\partial u}{\partial t} = -\gamma\, \frac{\partial^2 u}{\partial x^2}, \qquad \text{with a negative diffusion coefficient} \quad -\gamma < 0, \tag{4.29} \]

is an ill-posed problem in the sense that small changes in the initial data (e.g., a small
perturbation of a high-frequency mode) can produce arbitrarily large changes in the
solution arbitrarily close to the initial time. In other words, the solution does not depend
continuously on the initial data. Even worse, for nonsmooth initial data, the solution is not
even well defined in forwards time $t > 0$ (although it is well-posed if we run $t$ backwards).
The same holds for more general diffusion processes, e.g., (4.6). If, as in all physically
relevant cases, the coefficient of $u_{xx}$ is everywhere positive, then the initial value problem
is well-posed for $t > 0$, but ill-posed for $t < 0$. On the other hand, if the coefficient is
everywhere negative, the reverse holds. A coefficient that changes signs would cause the
differential equation to be ill-posed in both directions.

While theoretically undesirable, the unsmoothing effect of the backwards heat equation
has potential benefits in certain contexts. For example, in image processing, diffusion
will gradually blur an image by damping out the high-frequency modes. Image enhancement
is the reverse process, and can be based on running the heat flow backwards in some
stable manner. In forensics, determining the time of death based on the current temperature
of a corpse also requires running the equations governing the dissipation of body
heat backwards in time. One option would be to restrict the backwards evolution to the
first few Fourier modes, which prevents the small-scale fluctuations from overwhelming the
computation. Ill-posed problems also arise in the reconstruction of subterranean profiles
from seismic data, a central problem of the oil and gas industry. These and other applications
are driving contemporary research into how to cleverly circumvent the ill-posedness
of backwards diffusion processes.
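The ill-posedness can be seen directly from the mode-by-mode factors $\exp(-\lambda_n t)$ with $\lambda_n = \gamma (n \pi/\ell)^2$. The sketch below perturbs a single high-frequency coefficient by a tiny amount and evolves the sine coefficients a short time into the past, first with all modes and then restricted to the first few, as suggested above. The dictionary representation, the function names, and the particular numbers are illustrative assumptions.

```python
import math

def mode_factor(n, t, gamma=1.0, ell=1.0):
    """exp(-lambda_n t): decays for t > 0 and grows for t < 0."""
    return math.exp(-gamma * (n * math.pi / ell) ** 2 * t)

def evolve_coefficients(b, t, max_mode=None, gamma=1.0, ell=1.0):
    """Map sine coefficients {n: b_n} at time 0 to their values at time t.

    If max_mode is given, modes with n > max_mode are discarded, which is
    one crude way of regularizing the backwards evolution.
    """
    return {n: bn * mode_factor(n, t, gamma, ell)
            for n, bn in b.items()
            if max_mode is None or n <= max_mode}

perturbed = {1: 1.0, 20: 1e-10}     # smooth data plus a tiny high-frequency error
t = -0.01                           # a short step backwards in time

full = evolve_coefficients(perturbed, t)
truncated = evolve_coefficients(perturbed, t, max_mode=3)

print(full[1], full[20])            # the 1e-10 perturbation now dwarfs the n = 1 term
print(truncated)                    # truncation keeps the computation bounded
```

Run forwards instead (with `t = 0.01`), the same perturbation is damped far below rounding level, which is the smoothing property seen from the other side.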


† To avoid artifacts at the ends of the interval, we are, in fact, using periodic boundary
conditions in the plots. Away from the ends, running the equation with Dirichlet boundary
conditions leads to almost identical results.
Remark: The irreversibility of the heat equation, along with the irreversibility of
nonlinear transport in the presence of shock waves discussed in Section 2.3, highlights a crucial
distinction between partial differential equations and ordinary differential equations.
Ordinary differential equations are always reversible: the existence, uniqueness, and
continuous dependence properties of solutions are all equally valid in reverse time (although
their detailed qualitative and quantitative properties will, of course, depend upon whether
time is running forwards or backwards). The irreversibility and ill-posedness of partial
differential equations modeling thermodynamical, biological, and other diffusive processes
in our universe may explain why Time’s Arrow points exclusively to the future.


The  Heated  Ring  Redux 


Let us next consider the periodic boundary value problem modeling heat flow in an
insulated circular ring. We fix the length of the ring to be $\ell = 2\pi$, with $-\pi < x < \pi$
representing the “angular” coordinate around the ring. For simplicity, we also choose units
in which the thermal diffusivity is $\gamma = 1$. Thus, we seek to solve the heat equation

\[ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad -\pi < x < \pi, \qquad t > 0, \tag{4.30} \]

subject to periodic boundary conditions

\[ u(t, -\pi) = u(t, \pi), \qquad \frac{\partial u}{\partial x}(t, -\pi) = \frac{\partial u}{\partial x}(t, \pi), \qquad t > 0, \tag{4.31} \]

that ensure continuity of the solution when the angular coordinate switches from $-\pi$ to $\pi$.
The initial temperature distribution is

\[ u(0, x) = f(x), \qquad -\pi < x < \pi. \tag{4.32} \]

The resulting temperature $u(t, x)$ will be a periodic function in $x$ of period $2\pi$.

Substituting the separable solution ansatz (3.15) into the heat equation and the
boundary conditions results in the periodic eigenvalue problem

\[ \frac{d^2 v}{dx^2} + \lambda\, v = 0, \qquad v(-\pi) = v(\pi), \qquad v'(-\pi) = v'(\pi). \tag{4.33} \]


As we already noted in Section 3.1, the eigenvalues of this particular boundary value
problem are $\lambda_n = n^2$, where $n = 0, 1, 2, \ldots$ is a nonnegative integer; the corresponding
eigenfunctions are the trigonometric functions

\[ v_n(x) = \cos n x, \qquad \widetilde{v}_n(x) = \sin n x, \qquad n = 0, 1, 2, \ldots\,. \]

Note that $\lambda_0 = 0$ is a simple eigenvalue, with constant eigenfunction $\cos 0x = 1$ (the
sine solution $\sin 0x \equiv 0$ is trivial), while the positive eigenvalues are, in fact, double, each
possessing two linearly independent eigenfunctions. The corresponding eigensolutions to
the heated ring equation (4.30-31) are

\[ u_n(t, x) = e^{-n^2 t} \cos n x, \qquad \widetilde{u}_n(t, x) = e^{-n^2 t} \sin n x, \qquad n = 0, 1, 2, 3, \ldots\,. \]

The resulting infinite series solution is

\[ u(t, x) = \tfrac12\, a_0 + \sum_{n=1}^{\infty} \bigl( a_n\, e^{-n^2 t} \cos n x + b_n\, e^{-n^2 t} \sin n x \bigr), \tag{4.34} \]
with as yet unspecified coefficients $a_n, b_n$. The initial conditions require

\[ u(0, x) = \tfrac12\, a_0 + \sum_{n=1}^{\infty} \bigl( a_n \cos n x + b_n \sin n x \bigr) = f(x), \tag{4.35} \]

which is precisely the complete Fourier series (3.34) of the initial temperature profile $f(x)$.
Consequently,

\[ a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos n x\, dx, \qquad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin n x\, dx, \tag{4.36} \]

are its usual Fourier coefficients (3.35).

As in the Dirichlet problem, after the initial instant, the high-frequency terms in the
series (4.34) become extremely small, since $e^{-n^2 t} \ll 1$ for $n \gg 0$. Therefore, as soon as
$t > 0$, the solution instantaneously becomes smooth, and quickly degenerates into what is
in essence a finite sum over the first few Fourier modes. Moreover, as $t \to \infty$, all of the
Fourier modes will decay to zero with the exception of the constant mode, associated with
the null eigenvalue $\lambda_0 = 0$. Consequently, the solution will converge, at an exponential
rate, to a constant-temperature profile,

\[ u(t, x) \ \longrightarrow \ \tfrac12\, a_0 = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x)\, dx, \]


which  equals  the  average  of  the  initial  temperature  profile.  In  physical  terms,  since  the 
insulation  prevents  any  thermal  energy  from  escaping  the  ring,  it  rapidly  redistributes  itself 
so  that  the  ring  achieves  a  uniform  constant  temperature  —  its  eventual  equilibrium  state. 
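The convergence to the average temperature is easy to observe numerically. The sketch below assumes a hypothetical initial profile (equal to 1 on the half of the ring where $|x| < \pi/2$ and 0 elsewhere), approximates the Fourier coefficients $a_n$, $b_n$ by a midpoint rule on $[-\pi, \pi]$, and evaluates a partial sum of the series with factors $e^{-n^2 t}$. The function names, the profile, and the truncation level are illustrative choices.

```python
import math

def fourier_coefficients(f, n_max, samples=4000):
    """Midpoint-rule approximations to a_n, b_n (n = 0..n_max) on [-pi, pi]."""
    h = 2.0 * math.pi / samples
    xs = [-math.pi + (j + 0.5) * h for j in range(samples)]
    fs = [f(x) for x in xs]
    a = [h / math.pi * sum(fv * math.cos(n * x) for fv, x in zip(fs, xs))
         for n in range(n_max + 1)]
    b = [h / math.pi * sum(fv * math.sin(n * x) for fv, x in zip(fs, xs))
         for n in range(n_max + 1)]
    return a, b

def ring_temperature(a, b, t, x):
    """Partial sum: a_0 / 2 plus the modes damped by exp(-n^2 t)."""
    total = 0.5 * a[0]
    for n in range(1, len(a)):
        total += math.exp(-n * n * t) * (a[n] * math.cos(n * x) + b[n] * math.sin(n * x))
    return total

def step_profile(x):
    """Hypothetical initial data: hot on half of the ring, cold on the other half."""
    return 1.0 if abs(x) < math.pi / 2.0 else 0.0

a, b = fourier_coefficients(step_profile, n_max=20)
for t in (0.1, 1.0, 5.0, 50.0):
    print(t, ring_temperature(a, b, t, 0.0), ring_temperature(a, b, t, math.pi))
```

For this profile the average is $1/2$, and the temperatures at $x = 0$ and $x = \pi$ approach it from above and below, respectively.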

Prior to attaining equilibrium, only the very lowest frequency Fourier modes will still
be noticeable, and so the solution will asymptotically look like

\[ u(t, x) \approx \tfrac12\, a_0 + e^{-t} ( a_1 \cos x + b_1 \sin x ) = \tfrac12\, a_0 + r_1\, e^{-t} \cos(x - \delta_1), \tag{4.37} \]

where

\[ a_1 = r_1 \cos \delta_1 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos x\, dx, \qquad b_1 = r_1 \sin \delta_1 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin x\, dx. \]

Thus, for most initial data, the solution approaches thermal equilibrium at an exponential
rate of $e^{-t}$. The exceptions are when $a_1 = b_1 = 0$, for which the rate of convergence is
even faster, namely at a rate $e^{-k^2 t}$, where $k$ is the smallest integer such that at least one
of the $k$th order Fourier coefficients $a_k, b_k$ is nonzero.
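The amplitude and phase form of the first mode can be checked with a few lines of arithmetic. This sketch assumes the convention $a_1 = r_1 \cos \delta_1$, $b_1 = r_1 \sin \delta_1$, under which $a_1 \cos x + b_1 \sin x = r_1 \cos(x - \delta_1)$; the function name and the sample coefficients are hypothetical.

```python
import math

def amplitude_phase(a1, b1):
    """Return (r1, delta1) with a1 = r1 cos(delta1) and b1 = r1 sin(delta1)."""
    r1 = math.hypot(a1, b1)
    delta1 = math.atan2(b1, a1)
    return r1, delta1

a1, b1 = 0.3, -0.4                  # sample first-order Fourier coefficients
r1, delta1 = amplitude_phase(a1, b1)

for x in (-2.0, 0.0, 0.7, 3.0):
    lhs = a1 * math.cos(x) + b1 * math.sin(x)
    rhs = r1 * math.cos(x - delta1)
    print(x, lhs, rhs)
```

Using `atan2` rather than a plain arctangent places $\delta_1$ in the correct quadrant for every sign combination of $a_1$ and $b_1$.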

In fact, once we are convinced that the bar must tend to thermal equilibrium as $t \to \infty$,
we can predict the final temperature without knowing the explicit solution formula. Our
derivation in Section 4.1 implies that the heat equation has the form of a conservation law
(4.1), with the conserved density being the temperature $u(t, x)$. As in (4.2), the integrated
form of the conservation law reads

\[ \frac{d}{dt} \int_{-\pi}^{\pi} u(t, x)\, dx = \int_{-\pi}^{\pi} \frac{\partial u}{\partial t}(t, x)\, dx = \gamma \int_{-\pi}^{\pi} \frac{\partial^2 u}{\partial x^2}(t, x)\, dx = \gamma \Bigl[ \frac{\partial u}{\partial x}(t, \pi) - \frac{\partial u}{\partial x}(t, -\pi) \Bigr] = 0, \]

where the flux terms cancel thanks to the periodic boundary conditions (4.31). Physically,
any flux out of one end of the circular bar is immediately fed into the other, abutting end,
and so there is no net loss of thermal energy. We conclude that, for the periodic boundary
value problem, the total thermal energy

$$E(t) = \int_{-\pi}^{\pi} u(t,x)\,dx = \text{constant} \tag{4.38}$$

remains constant for all time. (In contrast, the thermal energy does not remain constant
for the Dirichlet boundary value problem, decaying steadily to 0 due to the out-flux of heat
through the ends of the bar; see Exercise 4.1.13 for further details.)

Remark: More correctly, according to (4.3), the thermal energy is obtained by multiplying
the temperature by the product, $\sigma = \rho\,\chi$, of the density and the specific heat of the
body. For the heat equation, both are constant, and so the physical thermal energy equals
$\sigma\,E(t)$. Mathematically, we can safely ignore this extra constant factor, or, equivalently,
work in physical units in which $\sigma = 1$. This does not extend to nonuniform bodies, whose
thermal energy is given by $E(t) = \int_{-\pi}^{\pi} \sigma(x)\,u(t,x)\,dx$, and whose constancy, under suitable
boundary conditions, follows from the conservation-law form (4.6) of the linear diffusion
equation.


In general, a system is in (static) equilibrium if it remains unaltered as time progresses.
Thus, any equilibrium configuration has the form $u = u^\star(x)$, and hence satisfies $\partial u^\star/\partial t = 0$.
If, in addition, $u^\star(x)$ is an equilibrium solution to the periodic heat equation (4.30-33),
then it must satisfy

$$0 = \frac{\partial u^\star}{\partial t} = \frac{\partial^2 u^\star}{\partial x^2}, \qquad u^\star(-\pi) = u^\star(\pi), \qquad \frac{du^\star}{dx}(-\pi) = \frac{du^\star}{dx}(\pi). \tag{4.39}$$

In other words, $u^\star$ is a solution to the periodic boundary value problem (4.33) for the null
eigenvalue $\lambda = 0$. Thus, the null eigenfunctions (including the zero solution) are all the
possible equilibrium solutions. In particular, for the periodic boundary value problem, the
null eigenfunctions are constant, and therefore solutions to the periodic heat equation will
tend to a constant equilibrium temperature.

Now, once we know that the solution tends to a constant, $u(t,x) \to a$ as $t \to \infty$, then
its thermal energy tends to

$$\int_{-\pi}^{\pi} u(t,x)\,dx \;\longrightarrow\; \int_{-\pi}^{\pi} a\,dx = 2\pi a \qquad \text{as} \quad t \to \infty.$$

On the other hand, as we just demonstrated, the thermal energy is constant, so

$$\int_{-\pi}^{\pi} u(t,x)\,dx = \int_{-\pi}^{\pi} u(0,x)\,dx = \int_{-\pi}^{\pi} f(x)\,dx.$$

Combining these two, we conclude that $\int_{-\pi}^{\pi} f(x)\,dx = 2\pi a$, and so the equilibrium
temperature

$$a = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx$$

equals the average initial temperature. This reconfirms our earlier result, but avoids having
to know an explicit series solution formula. As a result, the latter method can be applied
to a much wider range of situations.
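Both conclusions, conservation of the mean and exponential relaxation toward it, can be checked numerically. The following is a minimal sketch, assuming unit diffusivity as in the heated-ring series; the function names, the quadrature rule, and the sample initial profile are illustrative choices made here and are not part of the text.

```python
import math

def fourier_coefficients(f, n_modes, n_quad=2048):
    """Approximate (1/2)a_0, a_n, b_n on [-pi, pi] with a uniform periodic rule."""
    xs = [-math.pi + 2 * math.pi * j / n_quad for j in range(n_quad)]
    fs = [f(x) for x in xs]
    mean = sum(fs) / n_quad                      # (1/2) a_0 = average of f
    a = [2 * sum(v * math.cos(n * x) for v, x in zip(fs, xs)) / n_quad
         for n in range(1, n_modes + 1)]
    b = [2 * sum(v * math.sin(n * x) for v, x in zip(fs, xs)) / n_quad
         for n in range(1, n_modes + 1)]
    return mean, a, b

def heated_ring(mean, a, b, t, x):
    """Truncated series: mean + sum of e^{-n^2 t} (a_n cos nx + b_n sin nx)."""
    return mean + sum(
        math.exp(-n * n * t) * (a[n - 1] * math.cos(n * x) + b[n - 1] * math.sin(n * x))
        for n in range(1, len(a) + 1)
    )

# Hypothetical initial temperature profile whose average value is 1.
f = lambda x: 1 + math.cos(x) + 0.5 * math.sin(3 * x)
mean, a, b = fourier_coefficients(f, n_modes=10)
for t in (0.0, 1.0, 5.0):
    deviation = max(abs(heated_ring(mean, a, b, t, -math.pi + 0.1 * k) - mean)
                    for k in range(63))
    print(f"t = {t}: max deviation from the mean = {deviation:.4e}")
```

For this sample profile the deviation at later times should be dominated by the $n = 1$ mode and hence shrink roughly like $e^{-t}$, in line with (4.37).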
Inhomogeneous Boundary Conditions


So far, we have concentrated our attention on homogeneous boundary conditions. There is
a simple trick that will convert a boundary value problem with inhomogeneous but constant
Dirichlet boundary conditions,

$$\frac{\partial u}{\partial t} = \gamma\,\frac{\partial^2 u}{\partial x^2}, \qquad u(t,0) = \alpha, \quad u(t,\ell) = \beta, \qquad t > 0, \tag{4.40}$$

into a homogeneous Dirichlet problem. We begin by solving for the equilibrium temperature
profile. As in (4.39), the equilibrium does not depend on $t$ and hence satisfies the boundary
value problem

$$0 = \frac{\partial u^\star}{\partial t} = \gamma\,\frac{\partial^2 u^\star}{\partial x^2}, \qquad u^\star(0) = \alpha, \quad u^\star(\ell) = \beta.$$

Solving the ordinary differential equation yields $u^\star(x) = a + b\,x$, where the constants $a, b$ are
fixed by the boundary conditions. We conclude that the equilibrium solution is a straight
line connecting the boundary values:

$$u^\star(x) = \alpha + \frac{\beta - \alpha}{\ell}\,x. \tag{4.41}$$

The difference

$$\tilde u(t,x) = u(t,x) - u^\star(x) = u(t,x) - \alpha - \frac{\beta - \alpha}{\ell}\,x \tag{4.42}$$

measures the deviation of the solution from equilibrium. It clearly satisfies the homogeneous
boundary conditions at both ends:

$$\tilde u(t,0) = 0 = \tilde u(t,\ell).$$

Moreover, by linearity, since both $u(t,x)$ and $u^\star(x)$ are solutions to the heat equation, so
is $\tilde u(t,x)$. The initial data must be similarly adapted:

$$\tilde u(0,x) = u(0,x) - u^\star(x) = f(x) - \alpha - \frac{\beta - \alpha}{\ell}\,x \equiv \tilde f(x). \tag{4.43}$$

Solving the resulting homogeneous initial-boundary value problem, we write $\tilde u(t,x)$ in
Fourier series form (4.23), where the Fourier coefficients are specified by the modified
initial data $\tilde f(x)$ in (4.43). The solution to the inhomogeneous boundary value problem
thus has the series form

$$u(t,x) = \alpha + \frac{\beta - \alpha}{\ell}\,x + \sum_{n=1}^{\infty} \tilde b_n \exp\!\left(-\frac{\gamma\,n^2\pi^2}{\ell^2}\,t\right)\sin\frac{n\pi x}{\ell}, \tag{4.44}$$

where

$$\tilde b_n = \frac{2}{\ell}\int_0^{\ell} \tilde f(x)\,\sin\frac{n\pi x}{\ell}\,dx, \qquad n = 1, 2, 3, \ldots. \tag{4.45}$$

Since $\tilde u(t,x)$ decays to zero at an exponential rate as $t \to \infty$, the actual temperature profile
(4.44) will asymptotically decay to the equilibrium profile,

$$u(t,x) \;\longrightarrow\; u^\star(x) = \alpha + \frac{\beta - \alpha}{\ell}\,x,$$

at the same exponentially fast rate, governed by the first eigenvalue $\lambda_1 = \pi^2/\ell^2$, unless
$\tilde b_1 = 0$, in which case the decay rate is even faster.
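The recipe above (subtract the straight-line equilibrium, expand the remainder in a sine series, then add the equilibrium back) can be sketched in a few lines. This is an illustration under stated assumptions rather than a definitive implementation: the function names, the midpoint quadrature, the truncation at 40 modes, and the sample data are all choices made here.

```python
import math

def sine_coefficients(f_tilde, ell, n_modes, n_quad=2000):
    """(2/ell) * integral_0^ell f_tilde(x) sin(n pi x/ell) dx via the midpoint rule."""
    h = ell / n_quad
    mids = [(j + 0.5) * h for j in range(n_quad)]
    return [2.0 / ell * h * sum(f_tilde(x) * math.sin(n * math.pi * x / ell) for x in mids)
            for n in range(1, n_modes + 1)]

def inhomogeneous_dirichlet(f, alpha, beta, ell, gamma, n_modes=40):
    """Return u(t, x) for u_t = gamma u_xx, u(t,0) = alpha, u(t,ell) = beta, u(0,x) = f(x)."""
    u_star = lambda x: alpha + (beta - alpha) * x / ell       # equilibrium profile (4.41)
    f_tilde = lambda x: f(x) - u_star(x)                      # modified initial data (4.43)
    b = sine_coefficients(f_tilde, ell, n_modes)              # coefficients (4.45)

    def u(t, x):
        series = sum(
            b[n - 1] * math.exp(-gamma * (n * math.pi / ell) ** 2 * t)
            * math.sin(n * math.pi * x / ell)
            for n in range(1, n_modes + 1))
        return u_star(x) + series                             # series form (4.44)
    return u

# Hypothetical data: bar initially at 20 degrees, ends then held at 0 and 10.
u = inhomogeneous_dirichlet(lambda x: 20.0, alpha=0.0, beta=10.0, ell=1.0, gamma=1.0)
for t in (0.05, 0.5, 5.0):
    print(f"t = {t}: u(t, 0.5) = {u(t, 0.5):.6f}")
```

As $t$ grows, the printed midpoint temperature should settle toward $u^\star(0.5) = 5$, the value on the straight line joining the two boundary temperatures.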
This method does not work as well when the boundary conditions are time-dependent:

$$u(t,0) = \alpha(t), \qquad u(t,\ell) = \beta(t).$$

Attempting to mimic the preceding technique, we discover that the deviation*

$$\tilde u(t,x) = u(t,x) - u^\star(t,x), \qquad \text{where} \qquad u^\star(t,x) = \alpha(t) + \frac{\beta(t) - \alpha(t)}{\ell}\,x, \tag{4.46}$$

satisfies the homogeneous boundary conditions, but now solves an inhomogeneous or forced
version of the heat equation:

$$\frac{\partial \tilde u}{\partial t} = \gamma\,\frac{\partial^2 \tilde u}{\partial x^2} + h(t,x), \qquad \text{where} \qquad h(t,x) = -\,\frac{\partial u^\star}{\partial t}(t,x) = -\,\alpha'(t) - \frac{\beta'(t) - \alpha'(t)}{\ell}\,x. \tag{4.47}$$

Solution techniques for the latter partial differential equation will be discussed in Section 8.1
below.


Robin  Boundary  Conditions 


Consider a bar of unit length and unit thermal diffusivity, insulated along its length,
which has one of its ends held at 0° and the other put in a heat bath. The resulting
thermodynamics are modeled by the heat equation subject to Dirichlet boundary conditions
at $x = 0$ and Robin boundary conditions at $x = 1$:

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad u(t,0) = 0, \qquad \frac{\partial u}{\partial x}(t,1) + \beta\,u(t,1) = 0, \tag{4.48}$$

where $\beta \neq 0$ is a constant* that measures the rate of transfer of thermal energy, with $\beta > 0$
when the bath is cold and so the energy is being extracted from the bar. As before, the
general solution to the resulting initial-boundary value problem can be assembled from
the separable eigensolutions based on our usual exponential ansatz $u(t,x) = e^{-\lambda t}\,v(x)$.
Substituting this expression into (4.48), we find that the eigenfunction $v(x)$ must satisfy
the boundary value problem

$$-\,\frac{d^2 v}{dx^2} = \lambda\,v, \qquad v(0) = 0, \qquad v'(1) + \beta\,v(1) = 0. \tag{4.49}$$

* In this case, $u^\star(t,x)$ is not an equilibrium solution. Indeed, we do not expect the bar to go
to equilibrium if the temperature of its endpoints is constantly changing.

* The case $\beta = 0$ reduces to the mixed boundary value problem, whose analysis is left to the
reader.

In order to find nontrivial solutions $v(x) \not\equiv 0$ to (4.49), let us first assume $\lambda = \omega^2 > 0$,
where, without loss of generality, $\omega > 0$. The solution to the ordinary differential equation
that satisfies the Dirichlet boundary condition at $x = 0$ is a constant multiple of $v(x) =
\sin \omega x$. Substituting this function into the Robin boundary condition at $x = 1$, we find

$$\omega\cos\omega + \beta\sin\omega = 0, \qquad \text{or, equivalently,} \qquad \omega = -\,\beta\tan\omega. \tag{4.50}$$

It is not hard to see that there is an infinite number of real, positive solutions
$0 < \omega_1 < \omega_2 < \omega_3 < \cdots \to \infty$ to the latter transcendental equation. Indeed, they can be
characterized as the abscissas $\omega_n > 0$ of the intersection points of the graphs of the two
functions $f(\omega) = \omega$ and $g(\omega) = -\,\beta\tan\omega$, as shown in the first plot in Figure 4.3. Each
root $\omega_n$ defines a positive eigenvalue $\lambda_n = \omega_n^2 > 0$ to the boundary value problem (4.49)
and hence an exponentially decaying eigensolution

$$u_n(t,x) = e^{-\lambda_n t}\sin\omega_n x \tag{4.51}$$

to the Robin boundary value problem (4.48). While there is no explicit formula, numerical
approximations to the eigenvalues are easily found via a numerical root finder,
e.g., Newton's Method, [24,94]. In particular, for $\beta = 1$, the first three eigenvalues are
$\lambda_1 = \omega_1^2 \approx 4.1159$, $\lambda_2 = \omega_2^2 \approx 24.1393$, $\lambda_3 = \omega_3^2 \approx 63.6591$.

Figure 4.3. Eigenvalue equation for Robin boundary conditions.
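The eigenvalues quoted above can be reproduced with a few lines of code. The sketch below swaps in bisection for Newton's Method, since bisection needs no starting guess; it works with the equivalent form $\omega\cos\omega + \beta\sin\omega = 0$ of (4.50) to avoid the poles of the tangent. It assumes $\beta > 0$, in which case the $n$th root is bracketed by $\left((n - \tfrac12)\pi,\, n\pi\right)$; the function name and tolerance are illustrative.

```python
import math

def robin_frequencies(beta, count, tol=1e-12):
    """Roots of w*cos(w) + beta*sin(w) = 0 for beta > 0, found by bisection.

    For beta > 0 the left side changes sign on ((n - 1/2) pi, n pi),
    so each such interval brackets exactly one root.
    """
    F = lambda w: w * math.cos(w) + beta * math.sin(w)
    roots = []
    for n in range(1, count + 1):
        lo, hi = (n - 0.5) * math.pi, n * math.pi
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if F(lo) * F(mid) <= 0:
                hi = mid          # sign change in [lo, mid]
            else:
                lo = mid          # sign change in [mid, hi]
        roots.append(0.5 * (lo + hi))
    return roots

eigenvalues = [w * w for w in robin_frequencies(beta=1.0, count=3)]
print([round(lam, 4) for lam in eigenvalues])
```

With $\beta = 1$ the printed values should agree with the three eigenvalues listed above to the digits shown there.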

What about a zero eigenvalue? If $\lambda = 0$ in (4.49), then the solution to the ordinary
differential equation that satisfies the Dirichlet boundary condition is a constant multiple
of $v(x) = x$. This function satisfies the Robin boundary condition $v'(1) + \beta\,v(1) = 0$ if and
only if $\beta = -1$. In this special configuration, the heat equation admits a time-independent
eigensolution $u_0(t,x) = x$ with eigenvalue $\lambda_0 = 0$. Physically, the rate of transfer of thermal
energy into the bar through its end in the heat bath is exactly enough to cancel the heat
loss through the Dirichlet end, resulting in a steady-state solution. All other eigenmodes
correspond to positive eigenvalues, and hence are exponentially decaying. The general
solution decays to the steady state, which is a constant multiple of the null eigensolution:
$u(t,x) \to c\,x$ as $t \to \infty$, at an exponential rate prescribed, generically, by the first positive
eigenvalue $\lambda_1 > 0$.

However, in contrast to the more common types of boundary conditions (Dirichlet,
Neumann, mixed, periodic), we cannot automatically rule out the existence of negative
eigenvalues in the Robin case. Suppose $\lambda = -\,\omega^2 < 0$ with $\omega > 0$. Now the solution to
(4.49) that satisfies the Dirichlet boundary condition at $x = 0$ is a constant multiple of
the hyperbolic sine function $v(x) = \sinh \omega x$. Substituting this expression into the Robin
boundary condition at $x = 1$ produces

$$\omega\cosh\omega + \beta\sinh\omega = 0, \qquad \text{or, equivalently,} \qquad \omega = -\,\beta\tanh\omega, \tag{4.52}$$

where

$$\tanh\omega = \frac{\sinh\omega}{\cosh\omega} = \frac{e^{\omega} - e^{-\omega}}{e^{\omega} + e^{-\omega}} \tag{4.53}$$

is the hyperbolic tangent. If $\beta > -1$, there are no solutions $\omega > 0$ to this transcendental
equation, and in this case all the eigenvalues are strictly positive and all solutions to the
heat equation are exponentially decaying. On the other hand, if $\beta < -1$, there is a single
solution $\omega_0 > 0$, which produces a single negative eigenvalue $\lambda_0 = -\,\omega_0^2$. Representative
graphs illustrating the two possibilities appear in Figure 4.3; in the first, the graph of
$f(\omega) = \omega$ does not intersect the graph of $g(\omega) = \tfrac12\tanh\omega$ when $\omega > 0$, whereas it intersects
the graph of $g(\omega) = 2\tanh\omega$ at a single point, with abscissa $\omega_0 \approx 1.9150$, producing the
negative eigenvalue $\lambda_0 = -\,\omega_0^2 \approx -3.6673$. Thus, when $\beta < -1$, there is, in addition to all
the exponentially decaying eigenmodes associated with the positive eigenvalues, a single
unstable exponentially growing eigenmode

$$u_0(t,x) = e^{-\lambda_0 t}\sinh\omega_0 x = e^{\,\omega_0^2 t}\sinh\omega_0 x. \tag{4.54}$$

Physically, $\beta < -1$ implies that thermal energy is entering the Robin end of the bar at a
faster rate than can be removed through the Dirichlet end, and hence the bar experiences
an exponential increase in its overall temperature.
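The dichotomy between $\beta > -1$ and $\beta < -1$ is easy to probe numerically. The following sketch looks for the positive root of $\omega = -\,\beta\tanh\omega$ by bisection; the function name, the bracket $[10^{-6}, -\beta]$, and the tolerance are assumptions made for illustration, and the bracket presumes that $\beta$ is not within roundoff of $-1$.

```python
import math

def unstable_frequency(beta, tol=1e-12):
    """Positive root of w = -beta * tanh(w), or None when beta >= -1."""
    if beta >= -1.0:
        return None                      # no positive intersection in this range
    G = lambda w: w + beta * math.tanh(w)
    lo, hi = 1e-6, -beta                 # G(lo) < 0 near the origin; G(hi) > 0 since tanh < 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if G(mid) < 0:
            lo = mid                     # still left of the root
        else:
            hi = mid                     # at or right of the root
    return 0.5 * (lo + hi)

w0 = unstable_frequency(-2.0)
print(f"omega_0 = {w0:.4f}, lambda_0 = {-w0 * w0:.4f}")
print(unstable_frequency(-0.5))          # no negative eigenvalue expected here
```

For $\beta = -2$ the output should match the abscissa and negative eigenvalue quoted above, while $\beta = -\tfrac12$ yields no root.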

Remark :  Even  though  some  Robin  boundary  conditions  admit  exponentially  growing 
solutions,  and  hence  lead  to  unstable  dynamics,  the  initial-boundary  value  problem  remains 
well-posed  because  the  solution  exists  and  is  uniquely  determined  by  the  initial  data,  and, 
moreover,  small  changes  in  the  initial  conditions  induce  relatively  small  changes  in  the 
resulting  solution  on  bounded  time  intervals. 


The  Root  Cellar  Problem 

As  a  final  example,  we  discuss  a  problem  that  involves  analysis  of  the  heat  equation  on 
a  semi-infinite  interval.  The  question  is  this:  how  deep  should  you  dig  a  root  cellar?  In 
the  prerefrigeration  era,  a  root  cellar  was  used  to  keep  food  cool  in  the  summer,  but  not 
freeze  in  the  winter.  We  assume  that  the  temperature  inside  the  Earth  depends  only  on 
the  depth  and  the  time  of  year.  Let  u(t,  x )  denote  the  deviation  in  the  temperature  from 
its  annual  mean  at  depth  x  >  0  and  time  t.  We  shall  assume  that  the  temperature  at  the 
Earth’s  surface,  x  =  0,  fluctuates  in  a  periodic  manner;  specifically,  we  set 

$$u(t,0) = a\cos\omega t, \tag{4.55}$$

where the oscillatory frequency

$$\omega = \frac{2\pi}{365.25\ \text{days}} = 2.0 \times 10^{-7}\ \text{sec}^{-1} \tag{4.56}$$

refers to yearly temperature variations. In this model, we shall ignore daily temperature
fluctuations, since their effect is not significant below a very thin surface layer. At large
depths the temperature is assumed to be unvarying:

$$u(t,x) \to 0 \qquad \text{as} \qquad x \to \infty, \tag{4.57}$$

where 0 refers to the mean temperature.

Thus, we must solve the heat equation on a semi-infinite bar $0 < x < \infty$, with time-dependent
boundary conditions (4.55, 57) at the ends. The analysis will be simplified a
little if we replace the cosine by a complex exponential, and so we look for a complex
solution with boundary conditions

$$u(t,0) = a\,e^{\,i\omega t}, \qquad \lim_{x\to\infty} u(t,x) = 0. \tag{4.58}$$
Let us try a separable solution of the form

$$u(t,x) = v(x)\,e^{\,i\omega t}. \tag{4.59}$$

Substituting this expression into the heat equation $u_t = \gamma\,u_{xx}$ leads to

$$i\,\omega\,v(x)\,e^{\,i\omega t} = \gamma\,v''(x)\,e^{\,i\omega t}.$$

Canceling the common exponential factors, we conclude that $v(x)$ should solve the boundary
value problem

$$\gamma\,v''(x) = i\,\omega\,v, \qquad v(0) = a, \qquad \lim_{x\to\infty} v(x) = 0.$$

The solutions to the ordinary differential equation are

$$v_1(x) = e^{\sqrt{i\omega/\gamma}\;x} = e^{\sqrt{\omega/(2\gamma)}\,(1+i)\,x}, \qquad v_2(x) = e^{-\sqrt{i\omega/\gamma}\;x} = e^{-\sqrt{\omega/(2\gamma)}\,(1+i)\,x}.$$

The first solution is exponentially growing as $x \to \infty$, and so not germane to our problem.
The solution to the boundary value problem must therefore be a multiple of the
exponentially decaying solution:

$$v(x) = a\,e^{-\sqrt{\omega/(2\gamma)}\,(1+i)\,x}.$$


Substituting back into (4.59), we find the (complex) solution to the root cellar problem to
be

$$u(t,x) = a\,e^{-x\sqrt{\omega/(2\gamma)}}\;e^{\,i\left(\omega t - \sqrt{\omega/(2\gamma)}\,x\right)}. \tag{4.60}$$

The corresponding real solution is obtained by taking the real part,

$$u(t,x) = a\,e^{-x\sqrt{\omega/(2\gamma)}}\cos\!\left(\omega t - \sqrt{\frac{\omega}{2\gamma}}\;x\right). \tag{4.61}$$

The first factor in (4.61) is exponentially decaying as a function of the depth. Thus, the
further underground one is, the less noticeable is the effect of the surface temperature
fluctuations. The second factor is periodic in time, with the same annual frequency $\omega$. The
interesting feature is that the temperature variations (4.61) are typically out of phase with
respect to the surface temperature fluctuations, having an overall phase lag of

$$\delta = \sqrt{\frac{\omega}{2\gamma}}\;x$$

that depends linearly on the depth $x$. In particular, a cellar built at a depth where $\delta$ is an
odd multiple of $\pi$ will be completely out of phase, being hottest in the winter, and coldest
in the summer. Thus, the (shallowest) ideal depth at which to build a root cellar would
take $\delta = \pi$, corresponding to a depth of

$$x = \pi\sqrt{\frac{2\gamma}{\omega}}. \tag{4.62}$$

For typical soils in the Earth, $\gamma \approx 10^{-6}$ meters$^2$ sec$^{-1}$, and so, with $\omega$ given by (4.56),
$x \approx 9.9$ meters. However, at this depth, the relative amplitude of the oscillations is

$$e^{-x\sqrt{\omega/(2\gamma)}} = e^{-\pi} \approx 0.04,$$

and hence there is only a 4% temperature fluctuation. In Minneapolis, the temperature
varies, roughly, from −40°C to +40°C, and hence our 10-meter-deep root cellar would
experience only a 3.2°C annual temperature deviation from the winter, when it is the
warmest, to the summer, when it is the coldest. Building the cellar twice as deep would
lead to a temperature fluctuation of 0.2%, now in phase with the surface variations, which
means that the cellar would be, for all practical purposes, at constant temperature year
round.
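The arithmetic behind these estimates is short enough to check directly. The sketch below evaluates the depth formula (4.62) and the amplitude factor from (4.61), taking the soil diffusivity $\gamma = 10^{-6}$ meters$^2$ sec$^{-1}$ quoted above; the variable and function names are illustrative. Because it uses the unrounded annual frequency rather than $2.0 \times 10^{-7}$, the depth it prints may differ slightly from the rounded figure in the text.

```python
import math

gamma = 1.0e-6                                   # soil diffusivity in m^2/s (value quoted above)
omega = 2 * math.pi / (365.25 * 24 * 3600)       # annual frequency in 1/s

def amplitude_and_lag(x):
    """Relative amplitude and phase lag of the annual oscillation at depth x (meters)."""
    k = math.sqrt(omega / (2 * gamma))           # decay rate and wavenumber coincide
    return math.exp(-k * x), k * x

depth = math.pi * math.sqrt(2 * gamma / omega)   # depth at which the phase lag equals pi
amp, lag = amplitude_and_lag(depth)
amp_deep, lag_deep = amplitude_and_lag(2 * depth)

print(f"omega = {omega:.3e} 1/s")
print(f"depth = {depth:.2f} m, relative amplitude = {amp:.4f}, lag / pi = {lag / math.pi:.3f}")
print(f"twice as deep: relative amplitude = {amp_deep:.5f}, lag / pi = {lag_deep / math.pi:.3f}")
```

The relative amplitudes depend only on the phase lag, namely $e^{-\pi}$ at the first depth and $e^{-2\pi}$ at twice that depth, so they are insensitive to the assumed value of $\gamma$; only the depth itself scales with $\sqrt{\gamma}$.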


Exercises 


4.1.1. Suppose the ends of a bar of length 1 and thermal diffusivity $\gamma = 1$ are held fixed at
respective temperatures 0° and 10°. (a) Determine the equilibrium temperature profile.

(b) Determine the rate at which the equilibrium temperature profile is approached.

(c) What does the temperature profile look like as it nears equilibrium?


4.1.2.  A  uniform  insulated  bar  1  meter  long  is  stored  at  room  temperature  of  20°  Celsius.  An 
experimenter  places  one  end  of  the  bar  in  boiling  water  and  the  other  end  in  ice  water. 

(a)  Set  up  an  initial-boundary  value  problem  that  models  the  temperature  in  the  bar. 

(b)  Find  the  equilibrium  temperature  distribution. 

(c)  Discuss  how  your  answer  depends  on  the  material  properties  of  the  bar. 

4.1.3. Consider the initial-boundary value problem

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad \begin{aligned} u(t,0) &= 0 = u(t,10), & t &> 0, \\ u(0,x) &= f(x), & 0 &< x < 10, \end{aligned}$$

for  the  heat  equation  where  the  initial  data  has  the  following  form: 

1  <  x  <  2, 

2  <  x  <  3, 

3  <  x  <  4, 

4  <  x  <  5, 
otherwise. 


Discuss  what  happens  to  the  solution  as  t  increases.  You  do  not  need  to  write  down  an  ex¬ 
plicit  formula,  but  for  full  credit  you  must  explain  (sketches  can  help)  at  least  three  or  four 
interesting  things  that  happen  to  the  solution  as  time  progresses. 


4.1.4. Find a series solution to the initial-boundary value problem for the heat equation
$u_t = u_{xx}$ for $0 < x < 1$ when one end of the bar is held at 0° and the other is insulated.
Discuss the asymptotic behavior of the solution as $t \to \infty$.

4.1.5.  Answer  Exercise  4.1.4  when  both  ends  of  the  bar  are  insulated. 

4.1.6. A metal bar, of length $\ell = 1$ meter and thermal diffusivity $\gamma = 2$, is taken out of a 100°
oven and then fully insulated except for one end, which is fixed to a large ice cube at 0°.
(a) Write down an initial-boundary value problem that describes the temperature $u(t,x)$ of
the bar at all subsequent times. (b) Write a series formula for the temperature distribution
$u(t,x)$ at time $t > 0$. (c) What is the equilibrium temperature distribution in the bar,
i.e., for $t \gg 0$? How fast does the solution go to equilibrium? (d) Just before the temperature
distribution reaches equilibrium, what does it look like? Sketch a picture and discuss.


4.1.7. A metal bar of length $\ell = 1$ and thermal diffusivity $\gamma = 1$ is fully insulated, including its
ends. Suppose the initial temperature distribution is

$$u(0,x) = \begin{cases} x, & 0 \le x \le \tfrac12, \\ 1 - x, & \tfrac12 \le x \le 1. \end{cases}$$

(a) Use Fourier series to write down the temperature distribution at time $t > 0$.
(b) What is the equilibrium temperature distribution in the bar, i.e., for $t \gg 0$?
(c) How fast does the solution go to equilibrium? (d) Just before the temperature
distribution reaches equilibrium, what does it look like? Sketch a picture and discuss.


4.1.8. (a) Find the series solution to the heat equation $u_t = u_{xx}$ on $-2 < x < 2$, $t > 0$, when
subject to the Dirichlet boundary conditions $u(t,-2) = u(t,2) = 0$ and the initial condition

$$u(0,x) = \begin{cases} x, & |x| < 1, \\ 0, & \text{otherwise}. \end{cases}$$

(b) Sketch a graph of the solution at some representative times. (c) At what rate does the
temperature approach thermal equilibrium?


4.1.9.  Solve  the  heat  equation  when  the  right-hand  end  of  a  bar  of  unit  length  is  held  at  a  fixed 
constant  temperature  a  while  the  left-hand  end  is  insulated.  Discuss  the  asymptotic 
behavior  of  the  solution. 


4.1.10. For each of the following initial temperature distributions, (i) write out the Fourier
series solution to the heated ring (4.30-32), and (ii) find the resulting equilibrium
temperature as $t \to \infty$: (a) $\cos x$, (b) $\sin^3 x$, (c) $|x|$, (d) $\begin{cases} 1, & -\pi < x < 0, \\ 0, & 0 < x < \pi. \end{cases}$

0 4.1.11. Suppose that the temperature $u(t,x)$ of a homogeneous bar satisfies the heat equation.
Show that the associated heat flux $w(t,x)$ is also a solution to the same heat equation.

0 4.1.12. Show that the time derivative $v = u_t$ of any solution to the heat equation is also a
solution. If $u(t,x)$ satisfies the initial condition $u(0,x) = f(x)$, what initial condition does
$v(t,x)$ inherit?

§ 4.1.13. Explain why the thermal energy $E(t) = \int_0^{\ell} u(t,x)\,dx$ is not constant for the Dirichlet
initial-boundary value problem for the heat equation on the interval $[0,\ell]$.

0 4.1.14. (a) Show that the thermal energy $E(t) = \int_0^{\ell} u(t,x)\,dx$ is constant for the Neumann
boundary value problem on the interval $[0,\ell]$. (b) Use part (a) to prove that the constant
equilibrium solution for the homogeneous Neumann boundary value problem is equal to the
mean initial temperature $u(0,x)$.

4.1.15. Let $u(t,x)$ be any nonconstant solution to the periodic heat equation (4.30-31). Prove
that the squared $L^2$ norm of the solution, $N(t) = \int_{-\pi}^{\pi} u(t,x)^2\,dx$, is a strictly decreasing
function of $t$. Remark: Interestingly, comparing this result with formula (4.38), we find
that, for the periodic boundary value problem, the integral of $u$ is constant, but the
integral of $u^2$ is strictly decreasing. How is this possible?


T 4.1.16. The cable equation $v_t = \gamma\,v_{xx} - \alpha\,v$, with $\gamma, \alpha > 0$, also known as the lossy heat
equation, was derived by the nineteenth-century Scottish physicist William Thomson to model
propagation of signals in a transatlantic cable. Later, in honor of his work on
thermodynamics, including determining the value of absolute zero temperature, he was
named Lord Kelvin by Queen Victoria. The cable equation was later used to model the
electrical activity of neurons. (a) Show that the general solution to the cable equation is
given by $v(t,x) = e^{-\alpha t}\,u(t,x)$, where $u(t,x)$ solves the heat equation $u_t = \gamma\,u_{xx}$.

(b) Find a Fourier series solution to the Dirichlet initial-boundary value problem

$$v_t = \gamma\,v_{xx} - \alpha\,v, \qquad v(0,x) = f(x), \qquad v(t,0) = 0 = v(t,1), \qquad 0 < x < 1, \quad t > 0.$$

Does your solution approach an equilibrium value? If so, how fast?

(c) Answer part (b) for the Neumann problem

$$v_t = \gamma\,v_{xx} - \alpha\,v, \qquad v(0,x) = f(x), \qquad v_x(t,0) = 0 = v_x(t,1), \qquad 0 < x < 1, \quad t > 0.$$


^ 4.1.17. The convection-diffusion equation $u_t + c\,u_x = \gamma\,u_{xx}$ is a simple model for the diffusion
of a pollutant in a fluid flow moving with constant speed $c$. Show that $v(t,x) = u(t, x + ct)$
solves the heat equation. What is the physical interpretation of this change of variables?

4.1.18. Combine Exercises 4.1.16-17 to solve the lossy convection-diffusion equation

$$u_t = \gamma\,u_{xx} + c\,u_x - \alpha\,u.$$
0 4.1.19. Let $\gamma > 0$ and $\lambda < 0$. (a) Find all solutions to the differential equation $\gamma\,v'' + \lambda\,v = 0$.
(b) Prove that the only solution that satisfies the boundary conditions $v(0) = 0$, $v(\ell) = 0$,
is the zero solution $v(x) \equiv 0$.

0 4.1.20. Answer Exercise 4.1.19 when $\lambda$ is a non-real complex number.


4.2  The  Wave  Equation 


Let us return to the one-dimensional wave equation

$$\frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2}, \tag{4.63}$$

with constant wave speed $c > 0$, used to model the vibrations of bars and strings. In Chapter 2,
we learned how to explicitly solve the wave equation by the method of d'Alembert.
Unfortunately, d'Alembert's approach does not extend to other equations of interest to us,
and so alternative solution techniques, particularly those based on Fourier methods, are
worth developing. Indeed, the resulting series solutions provide valuable insight into wave
dynamics on bounded intervals.


Separation  of  Variables  and  Fourier  Series  Solutions 


One  of  the  oldest  —  and  still  one  of  the  most  widely  used  —  techniques  for  constructing 
explicit  analytic  solutions  to  a  wide  range  of  linear  partial  differential  equations  is  the 
method  of  separation  of  variables.  We  have,  in  fact,  already  employed  a  simplified  version 
of  the  method  when  constructing  each  eigensolution  to  the  heat  equation  as  an  exponential 
function  of  t  times  a  function  of  x.  In  general,  the  separation  of  variables  method  seeks 
solutions  to  the  partial  differential  equation  that  can  be  written  as  the  product  of  functions 
of  the  individual  independent  variables.  For  the  wave  equation,  we  seek  solutions 


$$u(t,x) = w(t)\,v(x) \tag{4.64}$$

that  can  be  written  as  the  product  of  a  function  of  t  alone  and  a  function  of  x  alone. 
When  the  method  succeeds  (which  is  not  guaranteed  in  advance),  both  factors  are  found 
as  solutions  to  certain  ordinary  differential  equations. 

Let us see whether such an expression can possibly solve the wave equation. First of
all,

$$\frac{\partial^2 u}{\partial t^2} = w''(t)\,v(x), \qquad \frac{\partial^2 u}{\partial x^2} = w(t)\,v''(x),$$

where the primes indicate ordinary derivatives. Substituting these expressions into the
wave equation (4.63), we obtain

$$w''(t)\,v(x) = c^2\,w(t)\,v''(x).$$

Dividing both sides by $w(t)\,v(x)$ (which we assume is not identically zero, since otherwise
the solution would be trivial) yields

$$\frac{w''(t)}{w(t)} = c^2\,\frac{v''(x)}{v(x)},$$

which effectively "separates" the $t$ and $x$ variables on each side of the equation, whence
the name "separation of variables".

Now, how could a function of $t$ alone be equal to a function of $x$ alone? A moment's
reflection should convince the reader that this can happen if and only if the two functions
are constant,* so

$$\frac{w''(t)}{w(t)} = c^2\,\frac{v''(x)}{v(x)} = \lambda, \tag{4.65}$$

where we use $\lambda$ to indicate the common separation constant. Thus, the individual factors
$w(t)$ and $v(x)$ must satisfy ordinary differential equations

$$\frac{d^2 w}{dt^2} - \lambda\,w = 0, \qquad \frac{d^2 v}{dx^2} - \frac{\lambda}{c^2}\,v = 0,$$

as promised. We already know how to solve both of these ordinary differential equations
by elementary techniques. There are three different cases, depending on the sign of the
separation constant $\lambda$. As a result, each value of $\lambda$ leads to four independent separable
solutions to the wave equation, as listed in the accompanying table.


Separable  Solutions  to  the  Wave  Equation 
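The three cases can be worked out directly from the two ordinary differential equations $w'' = \lambda\,w$ and $v'' = (\lambda/c^2)\,v$. The summary below is a sketch derived here from those equations, writing $\lambda = \mp\,\omega^2$ with an auxiliary parameter $\omega > 0$; each row lists a basis for $w(t)$, a basis for $v(x)$, and the four products $u(t,x) = w(t)\,v(x)$.

```latex
\begin{array}{c|c|c|c}
\lambda & w(t) & v(x) & u(t,x) = w(t)\,v(x) \\ \hline
\lambda = -\,\omega^2 < 0
  & \cos\omega t,\ \sin\omega t
  & \cos\dfrac{\omega x}{c},\ \sin\dfrac{\omega x}{c}
  & \cos\omega t\,\cos\dfrac{\omega x}{c},\ \ \cos\omega t\,\sin\dfrac{\omega x}{c},\ \
    \sin\omega t\,\cos\dfrac{\omega x}{c},\ \ \sin\omega t\,\sin\dfrac{\omega x}{c} \\[2ex]
\lambda = 0
  & 1,\ t
  & 1,\ x
  & 1,\ \ x,\ \ t,\ \ t\,x \\[2ex]
\lambda = \omega^2 > 0
  & e^{-\omega t},\ e^{\omega t}
  & e^{-\omega x/c},\ e^{\omega x/c}
  & e^{-\omega (t + x/c)},\ \ e^{\omega (t + x/c)},\ \
    e^{-\omega (t - x/c)},\ \ e^{\omega (t - x/c)}
\end{array}
```

Only the first row produces solutions that are oscillatory in both $t$ and $x$; these are the ones that can meet homogeneous Dirichlet conditions at both ends of a finite string.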


So far, we have not taken the boundary conditions into account. Consider first the
case of a string of length $\ell$ with two fixed ends, and thus subject to homogeneous Dirichlet
boundary conditions

$$u(t,0) = 0 = u(t,\ell).$$

Substituting the separable ansatz (4.64), we find that $v(x)$ must satisfy

$$\frac{d^2 v}{dx^2} - \frac{\lambda}{c^2}\,v = 0, \qquad v(0) = 0 = v(\ell). \tag{4.66}$$

The complete system of (nontrivial) solutions to this boundary value problem was found
in (4.21):

$$v_n(x) = \sin\frac{n\pi x}{\ell}, \qquad \lambda_n = -\left(\frac{n\pi c}{\ell}\right)^{2}, \qquad n = 1, 2, 3, \ldots.$$


* Technical detail: one should assume that the underlying domain is connected for this to be
valid as stated. In practice, this technicality can be safely ignored.




Hence, according to the table, the corresponding separable solutions are

$$u_n(t,x) = \cos\frac{n\pi c\,t}{\ell}\,\sin\frac{n\pi x}{\ell}, \qquad \tilde u_n(t,x) = \sin\frac{n\pi c\,t}{\ell}\,\sin\frac{n\pi x}{\ell}. \tag{4.67}$$


We will now employ these solutions to construct a candidate series solution to the wave
equation subject to the prescribed boundary conditions:

$$u(t,x) = \sum_{n=1}^{\infty}\left[\,b_n\cos\frac{n\pi c\,t}{\ell}\,\sin\frac{n\pi x}{\ell} + d_n\sin\frac{n\pi c\,t}{\ell}\,\sin\frac{n\pi x}{\ell}\,\right]. \tag{4.68}$$

The solution is thus a linear combination of the natural Fourier modes vibrating with
frequencies

$$\omega_n = \frac{n\pi c}{\ell} = \frac{n\pi}{\ell}\sqrt{\frac{\kappa}{\rho}}, \qquad n = 1, 2, 3, \ldots, \tag{4.69}$$

where the second expression follows from (2.66). Observe that, the longer the length $\ell$
of the string, or the higher its density $\rho$, the slower the vibrations, whereas increasing its
stiffness or tension $\kappa$ speeds them up, in exact accordance with our physical intuition.

The  Fourier  coefficients  bn  and  dn  in  (4.68)  will  be  uniquely  determined  by  the  initial 
conditions 

Ou 

u( 0,  x)  =  /(x),  —  (0,  x)  =  p(x),  0  <  x  <  £. 

Differentiating  the  series  term  by  term,  we  discover  that  we  must  represent  the  initial 
displacement  and  velocity  as  Fourier  sine  series 


00 


u 


(0,  x)  =  Y^  K si 


nnx 

sm  — 7 —  =  j(x) 


n—  1 


£ 


du 

dt 


OO 


(0,x)  =  dn 


nnc  nnx 
sm 


n—  1 


£ 


£ 


g(x) 


Therefore. 


=  J  J  f(x)  sin  dx. 


n 


n  —  1,  2,  3, . . . 


(4.70) 


are the Fourier sine coefficients (3.85) of the initial displacement $f(x)$, while

$$ d_n = \frac{2}{n\pi c}\int_0^{\ell} g(x)\,\sin\frac{n\pi x}{\ell}\,dx, \qquad n = 1, 2, 3, \ldots, \tag{4.71} $$

are rescaled versions of the Fourier sine coefficients of the initial velocity $g(x)$.
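
The coefficient formulas (4.70) and (4.71) and the series (4.68) can be sketched numerically as follows. This is only an illustrative sketch under stated assumptions: the integrals are approximated by the composite trapezoid rule (a choice made here, not one prescribed by the text), and the helper names `sine_series_coefficients` and `wave_series` are hypothetical.

```python
import numpy as np

def sine_series_coefficients(f, g, ell, c, n_terms, n_quad=2000):
    """Approximate b_n from (4.70) and d_n from (4.71) by the trapezoid rule."""
    x = np.linspace(0.0, ell, n_quad + 1)
    h = ell / n_quad
    w = np.full(x.shape, h)          # trapezoid weights
    w[0] = w[-1] = h / 2.0
    n = np.arange(1, n_terms + 1)
    modes = np.sin(np.outer(n, x) * np.pi / ell)   # row n-1 holds sin(n*pi*x/ell)
    b = (2.0 / ell) * (modes @ (w * f(x)))
    d = (2.0 / (n * np.pi * c)) * (modes @ (w * g(x)))
    return b, d

def wave_series(t, x, b, d, ell, c):
    """Evaluate the truncated series (4.68) at time t and positions x."""
    n = np.arange(1, len(b) + 1)
    omega = n * np.pi * c / ell                    # frequencies (4.69)
    spatial = np.sin(np.outer(n, x) * np.pi / ell)
    temporal = b * np.cos(omega * t) + d * np.sin(omega * t)
    return temporal @ spatial

# Hypothetical data: a single-mode displacement and a single-mode velocity.
ell, c = 2.0, 3.0
f = lambda x: np.sin(np.pi * x / ell)
g = lambda x: np.sin(2.0 * np.pi * x / ell)
b, d = sine_series_coefficients(f, g, ell, c, n_terms=5)
print(np.round(b, 6))   # only b_1 is nonzero
print(np.round(d, 6))   # only d_2 is nonzero, equal to 1/omega_2
```

The factor $1/\omega_n$ in $d_n$ is what distinguishes it from an ordinary sine coefficient of $g(x)$: it compensates for the factor $n\pi c/\ell$ produced by differentiating $\sin(n\pi c\,t/\ell)$ in time.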


Example 4.3. A string of unit length fixed at both ends is held taut at its center and then released. Our task is to describe the ensuing vibrations. Let us assume that the physical units are chosen so that $c^2 = 1$, and so we are asked to solve the initial-boundary value problem

$$ u_{tt} = u_{xx}, \qquad u(0,x) = f(x), \quad u_t(0,x) = 0, \quad u(t,0) = u(t,1) = 0. \tag{4.72} $$

To be specific, we assume that the center of the string has been moved by half a unit, and so the initial displacement is

$$ f(x) = \begin{cases} x, & 0 \le x \le \tfrac12, \\ 1 - x, & \tfrac12 \le x \le 1. \end{cases} $$

Figure 4.4. Plucked string solution of the wave equation.


The vibrational frequencies $\omega_n = n\pi$ are the integral multiples of $\pi$, and so the natural modes of vibration are

$$ \cos n\pi t\,\sin n\pi x \quad\text{and}\quad \sin n\pi t\,\sin n\pi x \quad\text{for}\quad n = 1, 2, \ldots . $$

Consequently, the general solution to the boundary value problem is

$$ u(t,x) = \sum_{n=1}^{\infty} \bigl( b_n \cos n\pi t\,\sin n\pi x + d_n \sin n\pi t\,\sin n\pi x \bigr), $$

where

$$ b_n = 2\int_0^1 f(x)\,\sin n\pi x\,dx = \begin{cases} 4\displaystyle\int_0^{1/2} x\,\sin n\pi x\,dx = \dfrac{4\,(-1)^k}{(2k+1)^2\pi^2}, & n = 2k+1, \\[2ex] 0, & n = 2k, \end{cases} $$


while $d_n = 0$. Therefore, the solution is the Fourier sine series

$$ u(t,x) = \frac{4}{\pi^2}\sum_{k=0}^{\infty} (-1)^k\,\frac{\cos(2k+1)\pi t\,\sin(2k+1)\pi x}{(2k+1)^2}, \tag{4.73} $$


whose profile is depicted in Figure 4.4. At time $t = 1$, the original displacement is reproduced exactly, but upside down. The subsequent dynamics proceeds as before, but in mirror-image form. The original displacement reappears at time $t = 2$, after which time the motion is periodically repeated. Interestingly, at times $t_k = .5, 1.5, 2.5, \ldots$, the displacement is identically zero, $u(t_k,x) \equiv 0$, although the velocity is not, $u_t(t_k,x) \not\equiv 0$. The solution appears to be piecewise affine, i.e., its graph is a collection of straight lines. This can, in fact, be proved as a consequence of the d’Alembert formula; see Exercise 4.2.13. Observe that, unlike the heat equation, the wave equation does not smooth out discontinuities and corners in the initial data. And, although we will loosely refer to such piecewise $C^2$ functions as “solutions”, they are not, in fact, classical solutions. (Their status as weak solutions, though, can be established using the methods of Section 10.4.)
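
The observations made in Example 4.3 can be checked numerically by summing finitely many terms of the series (4.73). The sketch below is illustrative only: the function names `plucked_string` and `initial_displacement` and the truncation at 200 terms are choices made here, not part of the text.

```python
import numpy as np

def plucked_string(t, x, n_terms=200):
    """Partial sum of the series (4.73), keeping the first n_terms odd modes."""
    k = np.arange(n_terms)
    m = 2 * k + 1                                     # odd mode numbers 1, 3, 5, ...
    coeff = 4.0 * (-1.0) ** k / (np.pi ** 2 * m ** 2)
    x = np.asarray(x, dtype=float)
    return (coeff * np.cos(m * np.pi * t)) @ np.sin(np.outer(m, x) * np.pi)

def initial_displacement(x):
    """The tent-shaped displacement f(x) of Example 4.3."""
    x = np.asarray(x, dtype=float)
    return np.where(x <= 0.5, x, 1.0 - x)

x = np.linspace(0.0, 1.0, 101)
u0 = plucked_string(0.0, x)       # should approximate f(x)
u_half = plucked_string(0.5, x)   # displacement at t = .5
u1 = plucked_string(1.0, x)       # displacement at t = 1

print(np.max(np.abs(u0 - initial_displacement(x))))  # truncation error, largest at the corner
print(np.max(np.abs(u_half)))                        # zero up to rounding
print(np.max(np.abs(u1 + u0)))                       # u(1, x) = -u(0, x) up to rounding
```

The slow $1/(2k+1)^2$ decay of the coefficients reflects the corner in the initial data; the partial sums converge most slowly near $x = \tfrac12$.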

While the series form (4.68) of the solution is perhaps less satisfying than a d’Alembert-style formula, we can still use it to deduce important qualitative properties. First of all, since each term is periodic in $t$ with period $2\ell/c$, the entire solution is time periodic with that period: $u(t + 2\ell/c, x) = u(t,x)$. In fact, after half a period, the solution reduces to

$$ u\Bigl(\frac{\ell}{c}, x\Bigr) = \sum_{n=1}^{\infty} (-1)^n\, b_n \sin\frac{n\pi x}{\ell} = -\sum_{n=1}^{\infty} b_n \sin\frac{n\pi(\ell - x)}{\ell} = -\,u(0, \ell - x) = -f(\ell - x). $$

In general,

$$ u\Bigl(t + \frac{\ell}{c}, x\Bigr) = -\,u(t, \ell - x), \qquad u\Bigl(t + \frac{2\ell}{c}, x\Bigr) = u(t, x). \tag{4.74} $$


Therefore, the initial wave form is reproduced, first as an upside down mirror image of itself at time $t = \ell/c$, and then in its original form at time $t = 2\ell/c$. This has the important consequence that vibrations of (homogeneous) one-dimensional media are inherently periodic, because the fundamental frequencies (4.69) are all integer multiples of the lowest one: $\omega_n = n\,\omega_1$.
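
The half-period and full-period identities (4.74) hold for any choice of coefficients in the series (4.68), which the following sketch checks numerically. The coefficients are random numbers and the values of $\ell$, $c$, and $t$ are arbitrary; all of these, and the helper name `dirichlet_series`, are assumptions made purely for illustration.

```python
import numpy as np

def dirichlet_series(t, x, b, d, ell, c):
    """Evaluate a finite sum of the form (4.68) with given coefficients b_n, d_n."""
    n = np.arange(1, len(b) + 1)
    omega = n * np.pi * c / ell
    temporal = b * np.cos(omega * t) + d * np.sin(omega * t)
    return temporal @ np.sin(np.outer(n, x) * np.pi / ell)

rng = np.random.default_rng(0)          # arbitrary coefficients, fixed seed
ell, c, t = 2.0, 3.0, 0.37
b = rng.normal(size=8)
d = rng.normal(size=8)
x = np.linspace(0.0, ell, 41)

now = dirichlet_series(t, x, b, d, ell, c)
half = dirichlet_series(t + ell / c, x, b, d, ell, c)        # u(t + ell/c, x)
mirror = -dirichlet_series(t, ell - x, b, d, ell, c)         # -u(t, ell - x)
full = dirichlet_series(t + 2.0 * ell / c, x, b, d, ell, c)  # u(t + 2 ell/c, x)

print(np.max(np.abs(half - mirror)))  # zero up to rounding
print(np.max(np.abs(full - now)))     # zero up to rounding
```

The check works mode by mode: advancing time by $\ell/c$ multiplies the $n$-th temporal factor by $(-1)^n$, which is exactly the effect of replacing $x$ by $\ell - x$ and changing the overall sign.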


Remark :  The  immediately  preceding  remark  has  important  musical  consequences.  To 
the  human  ear,  sonic  vibrations  that  are  integral  multiples  of  a  single  frequency,  and  thus 
periodic  in  time,  sound  harmonious,  whereas  those  with  irrationally  related  frequencies, 
and  hence  experiencing  aperiodic  vibrations,  sound  dissonant.  This  is  why  most  tonal 
instruments  rely  on  vibrations  in  one  dimension,  be  it  a  violin  or  piano  string,  a  column 
of  air  in  a  wind  instrument  (flute,  clarinet,  trumpet,  or  saxophone),  a  xylophone  bar,  or 
a  triangle.  On  the  other  hand,  most  percussion  instruments  rely  on  the  vibrations  of  two- 
dimensional  media,  e.g.,  drums  and  cymbals,  or  three-dimensional  solid  bodies,  e.g.,  blocks. 
As  we  shall  see  in  Chapters  11  and  12,  the  frequency  ratios  of  the  latter  are  irrationally 
related,  and  hence  their  motion  is  only  quasiperiodic,  as  in  Example  2.20.  For  some  reason, 
our  appreciation  of  music  is  psychologically  attuned  to  the  differences  between  rationally 
related/periodic and irrationally related/quasiperiodic vibrations [105].


Consider next a string with both ends left free, and so subject to the Neumann boundary conditions

$$ \frac{\partial u}{\partial x}(t,0) = 0 = \frac{\partial u}{\partial x}(t,\ell). \tag{4.75} $$

The solutions of (4.66) satisfying $v'(0) = 0 = v'(\ell)$ are now

$$ v_n(x) = \cos\frac{n\pi x}{\ell} \quad\text{with}\quad \omega_n = \frac{n\pi c}{\ell}, \qquad n = 0, 1, 2, 3, \ldots . $$

The resulting solution takes the form of a Fourier cosine series

$$ u(t,x) = a_0 + c_0\,t + \sum_{n=1}^{\infty} \left( a_n \cos\frac{n\pi c\,t}{\ell}\,\cos\frac{n\pi x}{\ell} + c_n \sin\frac{n\pi c\,t}{\ell}\,\cos\frac{n\pi x}{\ell} \right). \tag{4.76} $$

The first two terms come from the null eigenfunction $v_0(x) = 1$ with $\omega_0 = 0$. The string vibrates with the same fundamental frequencies (4.69) as in the fixed-end case, but there is now an additional unstable mode $c_0\,t$ that is no longer periodic, but grows linearly in time. In general, the presence of null eigenfunctions implies that the wave equation admits unstable modes.

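Substituting (4.76) into the initial conditions

$$ u(0,x) = f(x), \qquad \frac{\partial u}{\partial t}(0,x) = g(x), \qquad 0 < x < \ell, $$

we find that the Fourier coefficients are prescribed, as before, by the initial displacement and velocity:

$$ a_n = \frac{2}{\ell}\int_0^{\ell} f(x)\,\cos\frac{n\pi x}{\ell}\,dx, \qquad c_n = \frac{2}{n\pi c}\int_0^{\ell} g(x)\,\cos\frac{n\pi x}{\ell}\,dx, \qquad n = 1, 2, 3, \ldots . $$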


The order-zero coefficients†

$$ a_0 = \frac{1}{\ell}\int_0^{\ell} f(x)\,dx, \qquad c_0 = \frac{1}{\ell}\int_0^{\ell} g(x)\,dx, $$

are equal to the average initial displacement and average initial velocity of the string. In particular, when $c_0 = 0$, there is no net initial velocity, and the unstable mode is not excited. In this case, the solution is time-periodic, oscillating around the position given by the average initial displacement. On the other hand, if $c_0 \ne 0$, the string will move off with constant average speed $c_0$, all the while vibrating at the same fundamental frequencies.
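
The role of the order-zero coefficients in the free-end series (4.76) can be illustrated with a short numerical sketch. As before, the trapezoid rule, the helper names `neumann_coefficients` and `neumann_series`, and the sample initial data are assumptions made here for illustration only.

```python
import numpy as np

def neumann_coefficients(f, g, ell, c, n_terms, n_quad=2000):
    """Approximate a_0, c_0, a_n, c_n for the cosine series (4.76) by the trapezoid rule."""
    x = np.linspace(0.0, ell, n_quad + 1)
    h = ell / n_quad
    w = np.full(x.shape, h)
    w[0] = w[-1] = h / 2.0
    n = np.arange(1, n_terms + 1)
    modes = np.cos(np.outer(n, x) * np.pi / ell)
    a0 = (w @ f(x)) / ell                     # average initial displacement
    c0 = (w @ g(x)) / ell                     # average initial velocity
    a = (2.0 / ell) * (modes @ (w * f(x)))
    cn = (2.0 / (n * np.pi * c)) * (modes @ (w * g(x)))
    return a0, c0, a, cn

def neumann_series(t, x, a0, c0, a, cn, ell, c):
    """Evaluate the truncated series (4.76), including the drift term c0 * t."""
    n = np.arange(1, len(a) + 1)
    omega = n * np.pi * c / ell
    temporal = a * np.cos(omega * t) + cn * np.sin(omega * t)
    return a0 + c0 * t + temporal @ np.cos(np.outer(n, x) * np.pi / ell)

# Hypothetical data: one vibrating mode plus a uniform initial velocity.
ell, c = 1.0, 1.0
f = lambda x: 1.0 + np.cos(np.pi * x)
g = lambda x: 0.5 + 0.0 * x
a0, c0, a, cn = neumann_coefficients(f, g, ell, c, n_terms=4)
print(a0, c0)                                   # averages of f and g
print(neumann_series(2.0, np.array([0.0]), a0, c0, a, cn, ell, c))
```

With these data the string as a whole drifts at the constant average speed $c_0$ while the single cosine mode keeps oscillating about the moving mean position.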

Similar  considerations  apply  to  the  periodic  boundary  value  problem  for  the  wave 
equation  on  a  circular  ring.  The  details  are  left  as  Exercise  4.2.6  for  the  reader. 


Exercises 


4.2.1.  In  music,  an  octave  corresponds  to  doubling  the  frequency  of  the  sound  waves.  On  my 
piano,  the  middle  C  string  has  length  .7  meter,  while  the  string  for  the  C  an  octave  higher 
has  length  .6  meter.  Assuming  that  they  have  the  same  density,  how  much  tighter  does  the 
shorter  string  need  to  be  tuned? 

4.2.2.  How  much  longer  would  a  piano  string  have  to  be  to  make  the  same  sound  when  it  is 
pulled  twice  as  tight? 

4.2.3.  Write  down  the  solutions  to  the  following  initial-boundary  value  problems  for  the  wave 
equation  in  the  form  of  a  Fourier  series: 

(a) $u_{tt} = u_{xx}$, $u(t,0) = u(t,\pi) = 0$, $u(0,x) = 1$, $u_t(0,x) = 0$;

(b) $u_{tt} = 2u_{xx}$, $u(t,0) = u(t,\pi) = 0$, $u(0,x) = 0$, $u_t(0,x) = 1$;

(c) $u_{tt} = 3u_{xx}$, $u(t,0) = u(t,\pi) = 0$, $u(0,x) = \sin x$, $u_t(0,x) = 0$;

(d) $u_{tt} = 4u_{xx}$, $u(t,0) = u(t,1) = 0$, $u(0,x) = x$, $u_t(0,x) = -x$;

(e) $u_{tt} = u_{xx}$, $u(t,0) = u_x(t,1) = 0$, $u(0,x) = 1$, $u_t(0,x) = 0$;

(f) $u_{tt} = 2u_{xx}$, $u_x(t,0) = u_x(t,2\pi) = 0$, $u(0,x) = -1$, $u_t(0,x) = 1$;

(g) $u_{tt} = u_{xx}$, $u_x(t,0) = u_x(t,1) = 0$, $u(0,x) = x(1 - x)$, $u_t(0,x) = 0$.


† Note that we have not included the usual $\tfrac12$ factor in the constant terms in the Fourier series (4.76).

4.2.4. Find all separable solutions to the wave equation $u_{tt} = u_{xx}$ on the interval $0 < x < \pi$ subject to (a) mixed boundary conditions $u(t,0) = 0$, $u_x(t,\pi) = 0$;

(b) Neumann boundary conditions $u_x(t,0) = 0$, $u_x(t,\pi) = 0$.

4.2.5. (a) Under what conditions is the solution to the Neumann boundary value problem (4.75) a periodic function of $t$? What is the period? (b) Establish explicit periodicity formulas of the form (4.74). (c) Under what conditions is the velocity $\partial u/\partial t$ periodic in $t$?

♥ 4.2.6. (a) Formulate the periodic initial-boundary value problem for the wave equation on the interval $-\pi < x < \pi$, modeling the vibrations of a circular ring. (b) Write out a formula for the solution to your problem in the form of a Fourier series. (c) Is the solution a periodic function of $t$? If so, what is the period? (d) Suppose the initial displacement coincides with that in Figure 4.6, while the initial velocity is zero. Describe what happens to the solution as time evolves.


4.2.7. Show that the time derivative, $v = \partial u/\partial t$, of any solution to the wave equation is also a solution. If you know the initial conditions of $u$, what initial conditions does $v$ satisfy?

4.2.8. Find all the separable real solutions to the wave equation subject to a restoring force: $u_{tt} = u_{xx} - u$. Discuss their long-term behavior.

♥ 4.2.9. Let $a, c > 0$ be positive constants. The telegrapher’s equation $u_{tt} + a\,u_t = c^2 u_{xx}$ represents a damped version of the wave equation. Consider the Dirichlet boundary value problem $u(t,0) = u(t,1) = 0$, on the interval $0 < x < 1$, with initial conditions $u(0,x) = f(x)$, $u_t(0,x) = 0$. (a) Find all separable solutions to the telegrapher’s equation that satisfy the boundary conditions. (b) Write down a series solution for the initial-boundary value problem. (c) Discuss the long-term behavior of your solution. (d) State a criterion that distinguishes overdamped from underdamped versions of the equation.

4.2.10. The fourth-order partial differential equation $u_{tt} = -u_{xxxx}$ is a simple model for a vibrating elastic beam. (a) Find all separable real solutions to the beam equation.

(b) Show that any $C^4$ (complex) solution to the Schrödinger equation $\mathrm{i}\,u_t = u_{xx}$ solves the beam equation.


4.2.11. The initial-boundary value problem

$$ u_{tt} = -u_{xxxx}, \qquad u(t,0) = u_{xx}(t,0) = u(t,1) = u_{xx}(t,1) = 0, \qquad u(0,x) = f(x), \quad u_t(0,x) = 0, \qquad 0 < x < 1, \quad t > 0, $$

models the vibrations of an elastic beam of unit length with simply supported ends, subject to a nonzero initial displacement $f(x)$ and zero initial velocity. (a) What are the vibrational frequencies for the beam? (b) Write down the solution to the initial-boundary value problem as a Fourier series. (c) Does the beam vibrate periodically

(i) for all initial conditions? (ii) for some initial conditions? (iii) for no initial conditions?


4.2.12. Multiple choice: The initial-boundary value problem

$$ u_{tt} = u_{xxxx}, \qquad u(t,0) = u_{xx}(t,0) = u(t,1) = u_{xx}(t,1) = 0, \qquad u(0,x) = f(x), \quad u_t(0,x) = g(x), \qquad 0 < x < 1, \quad t > 0, $$

is well-posed for (a) $t > 0$; (b) $t < 0$; (c) all $t$; (d) no $t$. Explain your answer.


The  d’Alembert  Formula  for  Bounded  Intervals 

In Theorem 2.15, we derived the explicit d’Alembert formula

$$ u(t,x) = \frac{f(x - ct) + f(x + ct)}{2} + \frac{1}{2c}\int_{x - ct}^{x + ct} g(z)\,dz \tag{4.77} $$

for solving the basic initial value problem for the wave equation on an infinite interval:

$$ \frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2}, \qquad u(0,x) = f(x), \quad \frac{\partial u}{\partial t}(0,x) = g(x), \qquad -\infty < x < \infty. $$

Figure 4.5. Odd periodic extension of a concentrated pulse.

In  this  section  we  explain  how  to  adapt  the  formula  in  order  to  solve  initial-boundary  value 
problems  on  bounded  intervals,  thereby  effectively  summing  the  Fourier  series  solution. 

The easiest case to deal with is the periodic problem on $0 < x < \ell$, with boundary conditions

$$ u(t,0) = u(t,\ell), \qquad u_x(t,0) = u_x(t,\ell). \tag{4.78} $$

If we extend the initial displacement $f(x)$ and velocity $g(x)$ to be periodic functions of period $\ell$, so $f(x + \ell) = f(x)$ and $g(x + \ell) = g(x)$ for all $x \in \mathbb{R}$, then the resulting d’Alembert solution (4.77) will also be periodic in $x$, so $u(t, x + \ell) = u(t,x)$. In particular, it satisfies the boundary conditions (4.78) and so coincides with the desired solution. Details are to be supplied in Exercises 4.2.27-28.

Next, suppose we have fixed (Dirichlet) boundary conditions

$$ u(t,0) = 0, \qquad u(t,\ell) = 0. \tag{4.79} $$

The resulting solution can be written as a Fourier sine series (4.68), and hence is both odd and $2\ell$-periodic in $x$. Therefore, to write the solution in d’Alembert form (4.77), we extend the initial displacement $f(x)$ and velocity $g(x)$ to be odd, periodic functions of period $2\ell$:

$$ f(-x) = -f(x), \quad f(x + 2\ell) = f(x), \qquad g(-x) = -g(x), \quad g(x + 2\ell) = g(x). $$

This will ensure that the d’Alembert solution also remains odd and periodic. As a result, it satisfies the homogeneous Dirichlet boundary conditions (4.79) for all $t$, cf. Exercise 4.2.31. Keep in mind that, while the solution $u(t,x)$ is defined for all $x$, the only physically relevant values occur on the interval $0 < x < \ell$. Nevertheless, the effects of displacements in the unphysical regime will eventually be felt as the propagating waves pass through the physical interval.
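
The odd periodic extension described above is straightforward to code. The following sketch treats only the case of zero initial velocity, $g \equiv 0$, so that the integral term in (4.77) drops out; the helper names `odd_periodic_extension` and `dalembert_fixed_ends`, and the tent-shaped test profile borrowed from Example 4.3, are assumptions made for illustration.

```python
import numpy as np

def odd_periodic_extension(f, ell):
    """Extend f from [0, ell] to an odd, 2*ell-periodic function on the whole line."""
    def f_ext(x):
        x = np.asarray(x, dtype=float)
        y = np.mod(x + ell, 2.0 * ell) - ell   # reduce to the period cell [-ell, ell)
        return np.sign(y) * f(np.abs(y))       # odd reflection about y = 0
    return f_ext

def dalembert_fixed_ends(f, ell, c):
    """d'Alembert solution (4.77) with fixed ends, assuming zero initial velocity."""
    f_ext = odd_periodic_extension(f, ell)
    def u(t, x):
        x = np.asarray(x, dtype=float)
        return 0.5 * (f_ext(x - c * t) + f_ext(x + c * t))
    return u

def tent(x):
    """Plucked-string profile of Example 4.3 on the unit interval."""
    return np.where(x <= 0.5, x, 1.0 - x)

u = dalembert_fixed_ends(tent, ell=1.0, c=1.0)
x = np.linspace(0.0, 1.0, 11)
print(u(0.0, x))   # the initial tent profile
print(u(0.5, x))   # identically zero
print(u(1.0, x))   # the tent profile turned upside down
```

Only the values on $0 \le x \le \ell$ are physically meaningful, but the extended data outside that interval are what produce the reflections at the two ends.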

For example, consider an initial displacement that is concentrated near $x = \xi$ for some $0 < \xi < \ell$. Its odd $2\ell$-periodic extension consists of two sets of replicas: those of the same form occurring at positions $\xi \pm 2\ell$, $\xi \pm 4\ell$, $\ldots$, and their upside-down mirror images at the intermediate positions $-\xi$, $-\xi \pm 2\ell$, $-\xi \pm 4\ell$, $\ldots$; Figure 4.5 shows a representative example. The resulting solution begins with each of the pulses, both positive and negative, splitting into two half-size replicas that propagate with speed $c$ in opposite directions. When a left and right moving pulse meet, they emerge from the interaction unaltered. The process repeats periodically, with an infinite row of half-size pulses moving to the right kaleidoscopically interacting with an infinite row moving to the left.

However, only the part of this solution that lies on $0 < x < \ell$ is actually observed on the physical string. The effect is as if one were watching the full solution as it passes by a window of length $\ell$. Such observers will interpret what they see a bit differently. To wit, the original pulse starting at position $0 < \xi < \ell$ splits up into two half-size replicas that move off in opposite directions. As each half-size pulse reaches an end of the string, it meets a mirror-image pulse that has been propagating in the opposite direction from the nonphysical regime. The pulse is reflected at the end of the interval and becomes an upside-down mirror image moving in the opposite direction. The original positive pulse has moved off the end of the string just as its mirror image has moved into the physical regime. (A common physical realization is a pulse propagating down a jump rope that is held fixed at its end; the reflected pulse returns upside down.) A similar reflection occurs as the other half-size pulse hits the other end of the physical interval, after which the solution consists of two upside-down half-size pulses moving back towards each other. At time $t = \ell/c$ they recombine at the point $\ell - \xi$ to instantaneously form a full-sized, but upside-down mirror image of the original disturbance, in accordance with (4.74). The recombined pulse in turn splits apart into two upside-down half-size pulses that, when each collides with the end, reflect and return to their original upright form. At time $t = 2\ell/c$, the pulses recombine to exactly reproduce the original displacement. The process then repeats, and the solution is periodic in time with period $2\ell/c$.

Figure 4.6. Solution to wave equation with fixed ends.

In  Figure  4.6,  the  first  picture  displays  the  initial  displacement.  In  the  second,  it  has 
split  into  left-  and  right-moving  half-size  clones.  In  the  third  picture,  the  left-moving  bump 
is  in  the  process  of  colliding  with  the  left  end  of  the  string.  In  the  fourth  picture,  it  has 
emerged  from  the  collision,  and  is  now  upside  down,  reflected,  and  moving  to  the  right. 
Meanwhile,  the  right-moving  pulse  is  starting  to  collide  with  the  right  end.  In  the  fifth 
picture,  both  pulses  have  completed  their  collisions  and  are  now  moving  back  towards  each 
other,  where,  in  the  last  picture,  they  recombine  into  an  upside-down  mirror  image  of  the 
original  pulse.  The  process  then  repeats  itself,  in  mirror  image,  finally  recombining  to  the 
original  pulse,  at  which  point  the  entire  process  starts  over. 

The Neumann (free) boundary value problem

$$ \frac{\partial u}{\partial x}(t,0) = 0 = \frac{\partial u}{\partial x}(t,\ell) \tag{4.80} $$

is handled similarly. Since the solution has the form of a Fourier cosine series in $x$, we extend the initial conditions to be even $2\ell$-periodic functions

$$ f(-x) = f(x), \quad f(x + 2\ell) = f(x), \qquad g(-x) = g(x), \quad g(x + 2\ell) = g(x). $$

The resulting d’Alembert solution (4.77) is also even and $2\ell$-periodic in $x$, and hence satisfies the boundary conditions, cf. Exercise 4.2.31(b). In this case, when a pulse hits one of the ends, its reflection remains upright, but becomes a mirror image of the original; a familiar physical illustration is a water wave that reflects off a solid wall. Further details are left to the reader in Exercise 4.2.22.
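
For free ends, the same idea applies with an even extension in place of the odd one. The sketch below again assumes zero initial velocity, so only the first term of (4.77) is needed, and it compares the d’Alembert form with a single separable cosine mode; the helper names and the test data $f(x) = \cos \pi x$, $\ell = 1$, $c = 2$ are hypothetical choices made for illustration.

```python
import numpy as np

def even_periodic_extension(f, ell):
    """Extend f from [0, ell] to an even, 2*ell-periodic function on the whole line."""
    def f_ext(x):
        x = np.asarray(x, dtype=float)
        y = np.mod(x + ell, 2.0 * ell) - ell   # reduce to the period cell [-ell, ell)
        return f(np.abs(y))                    # even reflection about y = 0
    return f_ext

def dalembert_free_ends(f, ell, c):
    """d'Alembert solution (4.77) with free ends, assuming zero initial velocity."""
    f_ext = even_periodic_extension(f, ell)
    def u(t, x):
        x = np.asarray(x, dtype=float)
        return 0.5 * (f_ext(x - c * t) + f_ext(x + c * t))
    return u

ell, c = 1.0, 2.0
f = lambda x: np.cos(np.pi * x)          # a single Neumann eigenfunction
u = dalembert_free_ends(f, ell, c)

x = np.linspace(0.0, ell, 21)
t = 0.3
separable = np.cos(np.pi * c * t) * np.cos(np.pi * x)   # the matching term of (4.76)
print(np.max(np.abs(u(t, x) - separable)))              # zero up to rounding
```

The agreement is a small instance of the d’Alembert formula summing the Fourier series: the product $\cos \pi c t \,\cos \pi x$ is exactly the average of two travelling copies of the even extension.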

In  summary,  we  have  now  studied  two  very  different  ways  to  solve  the  one-dimensional 
wave  equation.  The  first,  based  on  the  d’Alembert  formula,  emphasizes  their  particle-like 
aspects,  where  individual  wave  packets  collide  with  each  other,  or  reflect  at  the  boundary, 
all  the  while  maintaining  their  overall  form,  while  the  second,  based  on  Fourier  analysis, 
emphasizes  the  vibrational  or  wave-like  character  of  the  solutions.  Some  solutions  look 
like  vibrating  waves,  while  others  appear  much  more  like  interacting  particles.  But,  like 
the  proverbial  blind  men  describing  an  elephant,  these  are  merely  two  facets  of  the  same 
solution. The Fourier series formula shows how every particle-like solution can be decomposed into its constituent vibrational modes, while the d’Alembert formula demonstrates
how  vibrating  solutions  combine  into  moving  wave  packets. 

The coexistence of particle and wave features is reminiscent of the long-running historical debate over the nature of light. Newton and his disciples proposed a particle-based theory, anticipating the modern concept of photons. However, until the beginning of the twentieth century, most physicists advocated a wave-like or vibrational viewpoint. Einstein’s explanation of the photoelectric effect served to resurrect the particle interpretation.
Only  with  the  establishment  of  quantum  mechanics  was  the  debate  resolved  —  light,  and, 
indeed,  all  subatomic  particles  manifest  both  particle  and  wave  features,  depending  upon 
the  experiment  and  the  physical  situation.  But  a  theoretical  basis  for  the  perplexing  wave- 
particle  duality  could  have  been  found  already  in  Fourier’s  and  d’Alembert’s  competing 
solution  formulae  for  the  classical  wave  equation! 


Exercises 


^  4.2.13.  (a)  Solve  the  initial-boundary  value  problem  from  Example  4.3  using  the  d’Alembert 
method,  (b)  Verify  that  your  solution  coincides  with  the  Fourier  series  solution  derived 
above,  (c)  Justify  our  earlier  observation  that,  at  each  time  t,  the  solution  u(t,  x)  is  a 
piecewise  affine  function  of  x. 


4.2.14. Sketch the solution of the wave equation $u_{tt} = u_{xx}$ and describe its behavior when the initial displacement is the box function

$$ u(0,x) = \begin{cases} 1, & 1 < x < 2, \\ 0, & \text{otherwise,} \end{cases} $$

while the initial velocity is 0 in each of the following scenarios: (a) on the entire line $-\infty < x < \infty$; (b) on the half-line $0 < x < \infty$, with homogeneous Dirichlet boundary condition at the end; (c) on the half-line $0 < x < \infty$, with homogeneous Neumann boundary condition at the end; (d) on the bounded interval $0 < x < 5$ with homogeneous Dirichlet boundary conditions; (e) on the bounded interval $0 < x < 5$ with homogeneous Neumann boundary conditions.

4.2.15. Answer Exercise 4.2.14 when the initial velocity is the box function, while the initial displacement is zero.

4.2.16. Consider the initial-boundary value problem

$$ \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \qquad u(t,0) = 0 = u(t,10), \quad t > 0, \qquad u(0,x) = f(x), \quad u_t(0,x) = 0, \quad 0 < x < 10, $$

for the wave equation, where the initial data has the following form:

$$ f(x) = \begin{cases} 3x - 7.5, & 2.5 \le x \le 3, \\ 6 - 1.5x, & 3 \le x \le 4.5, \\ 1.5x - 7.5, & 4.5 \le x \le 5, \\ 0, & \text{otherwise.} \end{cases} $$

Discuss  what  happens  to  the  solution.  You  do  not  need  to  write  down  an  explicit  formula 
for  the  solution,  but  for  full  credit  you  must  explain  (sketches  can  help)  at  least  three  or 
four  interesting  things  that  happen  to  the  solution  as  time  progresses. 

4.2.17.  Repeat  Exercise  4.2.16  for  the  Neumann  boundary  conditions. 


4.2.18. Suppose the initial displacement of a string of length $\ell$ looks like the graph to the right. Assuming that the ends of the string are held fixed, graph the string’s profile at times $t = \ell/c$ and $2\ell/c$.


£ 4.2.19. Consider the wave equation $u_{tt} = u_{xx}$ on the interval $0 < x < 1$, with homogeneous Dirichlet boundary conditions at both ends. (a) Use the d’Alembert formula to explicitly solve the initial value problem $u(0,x) = x - x^2$, $u_t(0,x) = 0$. (b) Graph the solution profile at some representative times, and discuss what you observe. (c) Find the Fourier series at each $t$ of your solution and compare the two. (d) How many terms do you need to sum to obtain a reasonable approximation to the exact solution?

£ 4.2.20. Solve Exercise 4.2.19 for the initial conditions $u(0,x) = 0$, $u_t(0,x) = x^2 - x$.

X 4.2.21. Solve (i) Exercise 4.2.19, (ii) Exercise 4.2.20, when the solution is subject to homogeneous Neumann boundary conditions.


^ 4.2.22. Under what conditions is the solution to the Neumann boundary value problem for the wave equation on a bounded interval $[0, \ell]$ periodic in time? What is the period?


4.2.23. Discuss and sketch the behavior of the solution to the Neumann boundary value problem $u_{tt} = 4u_{xx}$, $0 < x < 1$, $u_x(t,0) = 0 = u_x(t,1)$, $u(0,x) = f(x)$, $u_t(0,x) = g(x)$, for

(a) a localized initial displacement: $f(x) = \begin{cases} 1, & .2 < x < .3, \\ 0, & \text{otherwise,} \end{cases}$ $\quad g(x) = 0$;

(b) a localized initial velocity: $f(x) = 0$, $\quad g(x) = \begin{cases} 1, & .2 < x < .3, \\ 0, & \text{otherwise.} \end{cases}$

4.2.24. (a) Explain how to solve the Neumann initial-boundary value problem

$$ \frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \qquad \frac{\partial u}{\partial x}(t,0) = 0 = \frac{\partial u}{\partial x}(t,1), \qquad u(0,x) = f(x), \quad \frac{\partial u}{\partial t}(0,x) = g(x), $$

on the interval $0 < x < 1$.

(b) Let

$$ f(x) = \begin{cases} x - \tfrac14, & \tfrac14 \le x \le \tfrac12, \\ \tfrac34 - x, & \tfrac12 \le x \le \tfrac34, \\ 0, & \text{otherwise,} \end{cases} $$

and $g(x) = 0$. Sketch the graph of the solution at a few representative times, and discuss what is happening. Is the solution periodic in time? If so, what is the period?

(c) Do the same when $f(x) = 0$ and $g(x) = x$.

4.2.25. (a) Write down a formula for the solution $u(t,x)$ to the initial-boundary value problem

$$ \frac{\partial^2 u}{\partial t^2} - 4\,\frac{\partial^2 u}{\partial x^2} = 0, \qquad u(0,x) = \sin x, \quad \frac{\partial u}{\partial t}(0,x) = \frac{\partial u}{\partial x}(t,0) = \frac{\partial u}{\partial x}(t,\pi) = 0, \qquad 0 < x < \pi, \quad t > 0. $$

(b) Find $u(\ldots)$. (c) Prove that $h(t) = u(t, \ldots)$ is a periodic function of $t$ and find its period. (d) Does $\partial u/\partial x$ have any discontinuities? If so, discuss their behavior.

4.2.26. Answer Exercise 4.2.25 for the mixed boundary conditions $u(t,0) = 0 = u_x(t,\pi)$.

T  4.2.27.  (a)  Explain  how  to  use  d’Alembert’s  formula  (4.77)  to  solve  the  periodic  initial-boundary 
value  problem  for  the  wave  equation  given  in  Exercise  4.2.6. 

(b)  Do  the  d’Alembert  and  Fourier  series  formulae  represent  the  same  solution?  If  so,  can 
you  justify  it?  If  not,  explain  why  they  are  different. 


0 4.2.28. Show that the solution $u(t,x)$ to the wave equation on an interval $[0, \ell]$, subject to periodic boundary conditions $u(t,0) = u(t,\ell)$, $u_x(t,0) = u_x(t,\ell)$, is a periodic function of $t$ if


ri 


4.2.29. (a) Explain how to solve the wave equation on a half-line $x > 0$ when subject to Dirichlet boundary conditions $u(t,0) = 0$. (b) Assuming $c = 1$, find the solution satisfying

u(0,  x)  =  (x  —  2)  e  2'2^  ,  ut( 0,  x)  =  0.  (c)  Sketch  a  picture  of  your  solution  at  some 
representative  times,  and  discuss  what  is  happening. 


4.2.30.  Solve  Exercise  4.2.29  for  homogeneous  Neumann  boundary  conditions  at  x  =  0. 

0 4.2.31. (a) Given that $f(x)$ is odd and $2\ell$-periodic, explain why $f(0) = 0 = f(\ell)$.

(b) Given that $f(x)$ is even and $2\ell$-periodic, explain why $f'(0) = 0 = f'(\ell)$.

0 4.2.32. (a) Prove that if $f(-x) = -f(x)$, $f(x + 2\ell) = f(x)$, for all $x$, then $u(t,x) = \tfrac12\bigl[\,f(x - ct) + f(x + ct)\,\bigr]$ satisfies the Dirichlet boundary conditions (4.79).

(b) Prove that if $g(-x) = -g(x)$, $g(x + 2\ell) = g(x)$ for all $x$, then $u(t,x) = \dfrac{1}{2c}\displaystyle\int_{x - ct}^{x + ct} g(z)\,dz$ also satisfies the Dirichlet boundary conditions.


4.2.33. If both $u(0,x) = f(x)$ and $u_t(0,x) = g(x)$ are even functions, show that the solution $u(t,x)$ of the wave equation is even in $x$ for all $t$.

4.2.34. (a) Prove that the solution $u(t,x)$ to the wave equation for $x \in \mathbb{R}$ is an even function of $t$ if and only if its initial velocity, at $t = 0$, is zero.

(b) Under what conditions is $u(t,x)$ an odd function of $t$?


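^ 4.2.35. Let $u(t,x)$ be a classical solution to the wave equation $u_{tt} = c^2 u_{xx}$ on the interval $0 < x < \ell$, satisfying homogeneous Dirichlet boundary conditions. The total energy of $u$ at time $t$ is

$$ E(t) = \frac12 \int_0^{\ell} \left[ \Bigl(\frac{\partial u}{\partial t}\Bigr)^{2} + c^2 \Bigl(\frac{\partial u}{\partial x}\Bigr)^{2} \right] dx. \tag{4.81} $$

Establish the Law of Conservation of Energy by showing that $E(t) = E(0)$ is a constant function.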


4.2.36. (a) Use Exercise 4.2.35 to prove that the only $C^2$ solution to the initial-boundary value problem $v_{tt} = c^2 v_{xx}$, $v(t,0) = v(t,\ell) = 0$, $v(0,x) = 0$, $v_t(0,x) = 0$, is the trivial solution $v(t,x) \equiv 0$. (b) Establish the following Uniqueness Theorem for the wave equation: given $f(x), g(x) \in C^2$, there is at most one $C^2$ solution $u(t,x)$ to the initial-boundary value problem $u_{tt} = c^2 u_{xx}$, $u(t,0) = u(t,\ell) = 0$, $u(0,x) = f(x)$, $u_t(0,x) = g(x)$.

4.2.37. Referring back to Exercises 4.2.35 and 4.2.36: (a) Does conservation of energy hold for solutions to the homogeneous Neumann initial-boundary value problem?

(b) Can you establish a uniqueness theorem for the Neumann problem?

4.2.38. Explain how to solve the Dirichlet initial-boundary value problem

$$ u_{tt} = c^2 u_{xx} + F(t,x), \qquad u(0,x) = f(x), \quad u_t(0,x) = g(x), \qquad u(t,0) = u(t,\ell) = 0, $$

for the wave equation subject to an external forcing on the interval $[0, \ell]$.


4.3  The  Planar  Laplace  and  Poisson  Equations 


The two-dimensional Laplace equation is the second-order linear partial differential equation

$$ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0, \tag{4.82} $$

named in honor of the influential eighteenth-century French mathematician Pierre-Simon Laplace. It, along with its higher-dimensional versions, is arguably the most important differential equation in all of mathematics. A real-valued solution $u(x,y)$ to the Laplace equation is known as a harmonic function. The space of harmonic functions can thus be identified as the kernel of the second-order linear partial differential operator

$$ \Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}, \tag{4.83} $$

known as the Laplace operator, or Laplacian for short. The inhomogeneous or forced version, namely

$$ -\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} = f(x,y), \tag{4.84} $$

is known as Poisson’s equation, named after Siméon-Denis Poisson, who was taught by Laplace. The mathematical and physical reasons for including the minus sign will gradually become clear.
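
To make the definitions concrete, the following sketch applies a standard five-point finite-difference approximation of the Laplacian to two sample functions. The finite-difference formula is a numerical device introduced here for illustration and is not part of the text’s development; the function names, the step size, and the sample point are likewise hypothetical.

```python
import numpy as np

def discrete_laplacian(u, x, y, h=1e-3):
    """Five-point finite-difference approximation to u_xx + u_yy at the point (x, y)."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4.0 * u(x, y)) / h**2

def harmonic(x, y):
    """x^2 - y^2 satisfies the Laplace equation (4.82)."""
    return x**2 - y**2

def bump(x, y):
    """sin(pi x) sin(pi y) satisfies Poisson's equation (4.84) with f = 2 pi^2 u."""
    return np.sin(np.pi * x) * np.sin(np.pi * y)

x0, y0 = 0.3, 0.7
lap_harmonic = discrete_laplacian(harmonic, x0, y0)
forcing = -discrete_laplacian(bump, x0, y0)        # approximates f in (4.84)
expected = 2.0 * np.pi**2 * bump(x0, y0)

print(lap_harmonic)          # zero up to rounding error
print(forcing, expected)     # agree up to the discretization error
```

With the minus sign in (4.84), the positive bump corresponds to a positive forcing $f$, which is one informal way to see why the sign convention is convenient.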

Besides their theoretical importance, the Laplace and Poisson equations arise as the basic equilibrium equations in a remarkable variety of physical systems. For example, we may interpret $u(x,y)$ as the displacement of a membrane, e.g., a drum skin; the inhomogeneity $f(x,y)$ in the Poisson equation represents an external forcing over the surface of the membrane. Another example is in the thermal equilibrium of flat plates; here $u(x,y)$ represents the temperature and $f(x,y)$ an external heat source. In fluid mechanics, $u(x,y)$ represents the potential function whose gradient $\mathbf{v} = \nabla u$ is the velocity vector field of a steady planar fluid flow. Similar considerations apply to two-dimensional electrostatic and gravitational potentials. The dynamical counterparts to the Laplace equation are the two-dimensional versions of the heat and wave equations, to be analyzed in Chapter 11.

Since both the Laplace and Poisson equations describe equilibrium configurations, they
almost always appear in the context of boundary value problems. We seek a solution $u(x,y)$
to the partial differential equation defined at points $(x,y)$ belonging to a bounded, open
domain $\Omega \subset \mathbb{R}^2$. The solution is required to satisfy suitable conditions on the boundary
of the domain, denoted by $\partial\Omega$, which will consist of one or more simple closed curves, as
illustrated in Figure 4.7. As in one-dimensional boundary value problems, there are several
especially important types of boundary conditions.

Figure 4.7. A planar domain with outward unit normals on its boundary.


The first are the fixed or Dirichlet boundary conditions, which specify the value of the
function $u$ on the boundary:

$$u(x,y) = h(x,y) \quad \text{for} \quad (x,y) \in \partial\Omega. \tag{4.85}$$

Under mild regularity conditions on the domain $\Omega$, the boundary values $h$, and the forcing
function $f$, the Dirichlet conditions (4.85) serve to uniquely specify the solution $u(x,y)$ to
the Laplace or the Poisson equation. Physically, in the case of a free or forced membrane,
the Dirichlet boundary conditions correspond to gluing the edge of the membrane to a
wire at height $h(x,y)$ over each boundary point $(x,y) \in \partial\Omega$, as illustrated in Figure 4.8.
A physical realization can be easily obtained by dipping the wire in a soap solution; the
resulting soap film spanning the wire forms a minimal surface, which, if the wire is reasonably
close to planar shape,† is the solution to the Dirichlet problem prescribed by the wire.
Similarly, in the modeling of thermal equilibrium, a Dirichlet boundary condition represents
the imposition of a prescribed temperature distribution, represented by the function
$h$, along the boundary of the plate.

The second important class consists of the Neumann boundary conditions

$$\frac{\partial u}{\partial n} = \nabla u \cdot \mathbf{n} = k(x,y) \quad \text{on} \quad \partial\Omega, \tag{4.86}$$


in which the normal derivative of the solution $u$ on the boundary is prescribed. In general, $\mathbf{n}$
denotes the unit outwards normal to the boundary $\partial\Omega$, i.e., the vector of unit length, $\|\mathbf{n}\| = 1$,
that is orthogonal to the tangent to the boundary and points away from the domain; see
Figure 4.7. For example, in thermomechanics, a Neumann boundary condition specifies
the heat flux out of a plate through its boundary. The “no-flux” or homogeneous Neumann
boundary conditions, where $k(x,y) = 0$, correspond to a fully insulated boundary. In the
case of a membrane, homogeneous Neumann boundary conditions correspond to a free,
unattached edge of a drum. In fluid mechanics, the Neumann conditions prescribe the
fluid flux through the boundary; in particular, homogeneous Neumann boundary conditions


† More generally, the minimal surface formed by the soap film solves the vastly more complicated
nonlinear minimal surface equation $(1 + u_y^2)\,u_{xx} - 2\,u_x u_y u_{xy} + (1 + u_x^2)\,u_{yy} = 0$, which, for
surfaces with small variation, i.e., with $\|\nabla u\| \ll 1$, can be approximated by the Laplace equation.




Figure  4.8.  Dirichlet  boundary  conditions. 


correspond  to  a  solid  boundary  that  the  fluid  cannot  penetrate.  More  generally,  the  Robin 
boundary  conditions 


$$\frac{\partial u}{\partial n} + \beta(x,y)\,u = k(x,y) \quad \text{on} \quad \partial\Omega,$$


also known as impedance boundary conditions due to their applications in electromagnetism,
are used to model insulated plates in heat baths, or membranes attached to springs.

Finally, one can mix the previous kinds of boundary conditions, imposing, say, Dirichlet
conditions on part of the boundary and Neumann conditions on the complementary
part. A typical mixed boundary value problem has the form


$$-\Delta u = f \ \text{ in } \ \Omega, \qquad u = h \ \text{ on } \ D, \qquad \frac{\partial u}{\partial n} = k \ \text{ on } \ N, \tag{4.87}$$

with the boundary $\partial\Omega = D \cup N$ being the disjoint union of a “Dirichlet segment”, denoted
by $D$, and a “Neumann segment” $N$. For example, if $u$ represents the equilibrium temperature
in a plate, then the Dirichlet segment of the boundary is where the temperature
is fixed, while the Neumann segment is insulated, or, more generally, has prescribed heat
flux. Similarly, when modeling the displacement of a membrane, the Dirichlet segment is
where the edge of the drum is attached to a support, while the homogeneous Neumann
segment is left hanging free.


Exercises 

4.3.1. (a) Solve the boundary value problem $\Delta u = 1$ for $x^2 + y^2 < 1$ and $u(x,y) = 0$ for
$x^2 + y^2 = 1$ directly. Hint: The solution is a simple polynomial.

(b)  Graph  your  solution,  interpreting  it  as  the  equilibrium  displacement  of  a  circular  drum 
under  a  constant  gravitational  force. 

4.3.2.  Set  up  the  boundary  value  problem  corresponding  to  the  equilibrium  of  a  circular 
membrane  subject  to  a  constant  downwards  gravitational  force,  half  of  whose  boundary  is 
glued  to  a  flat  semicircular  wire,  while  the  other  half  is  unattached. 

4.3.3.  Set  up  the  boundary  value  problem  corresponding  to  the  thermal  equilibrium  of  a 
rectangular  plate  that  is  insulated  on  two  of  its  sides,  has  0°  at  its  top  edge  and  100°  at  the 
bottom edge. Where do you expect the maximum temperature to be located? What is its
value?  Can  you  find  a  formula  for  the  temperature  inside  the  plate?  Hint :  The  solution  is 
constant  along  horizontal  lines. 

4.3.4.  Set  up  the  boundary  value  problem  corresponding  to  the  thermal  equilibrium  of  an 
insulated  semi-circular  plate  with  unit  diameter,  whose  curved  edge  is  kept  at  0°  and  whose 
straight  edge  is  at  50°. 

4.3.5.  Explain  why  the  solution  to  the  homogeneous  Neumann  boundary  value  problem  for  the 
Laplace  equation  is  not  unique. 

4.3.6.  Write  down  the  Dirichlet  boundary  value  problem  for  the  Laplace  equation  on  the  unit 
square  0  <  x,  y  <  1  that  is  satisfied  by  u(x,  y)  =  1  +  xy. 

4.3.7. Write down the Neumann boundary value problem for the Poisson equation on the unit
disk $x^2 + y^2 < 1$ that is satisfied by $u(x,y) = x^3 + x y^2$.

♦ 4.3.8. Suppose $u(x,y)$ is a solution to the Laplace equation.

(a) Show that any translate $U(x,y) = u(x - a, y - b)$, where $a, b \in \mathbb{R}$, is also a solution.

(b) Show that the rotated function $U(x,y) = u(x\cos\theta + y\sin\theta,\ -x\sin\theta + y\cos\theta)$, where
$-\pi < \theta \le \pi$, is also a solution.

♦ 4.3.9. (a) Show that if $u(x,y)$ solves the Laplace equation, then so does the rescaled function
$U(x,y) = c\,u(\alpha x, \alpha y)$ for any constants $c, \alpha$.

(b) Discuss the effect of scaling on the Dirichlet boundary value problem.

(c) What happens if we use different scaling factors in $x$ and $y$?
Our first approach to solving the Laplace equation

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{4.88}$$

will be based on the method of separation of variables. As in (4.64), we seek solutions that
can be written as a product

$$u(x,y) = v(x)\,w(y) \tag{4.89}$$

of a function of $x$ alone times a function of $y$ alone. We compute

$$\frac{\partial^2 u}{\partial x^2} = v''(x)\,w(y), \qquad \frac{\partial^2 u}{\partial y^2} = v(x)\,w''(y),$$

and so

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = v''(x)\,w(y) + v(x)\,w''(y) = 0.$$


We then separate the variables by placing all the terms involving $x$ on one side of the
equation and all the terms involving $y$ on the other; this is accomplished by dividing by
$v(x)\,w(y)$ and then writing the resulting equation in the separated form

$$\frac{v''(x)}{v(x)} = -\,\frac{w''(y)}{w(y)} = \lambda. \tag{4.90}$$
As we argued in (4.65), the only way a function of $x$ alone can be equal to a function of $y$
alone is if both functions are equal to a common separation constant $\lambda$. Thus, the factors
$v(x)$ and $w(y)$ must satisfy the elementary ordinary differential equations

$$v'' - \lambda\,v = 0, \qquad w'' + \lambda\,w = 0.$$

As before, the solution formulas depend on the sign of the separation constant $\lambda$. We list
the resulting collection of separable harmonic functions in the following table:


Separable  Solutions  to  Laplace’s  Equation 
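The body of this table did not survive in the present copy. As a hedged reconstruction (not necessarily the original layout), solving the two constant-coefficient ordinary differential equations for each sign of the separation constant, with $\omega > 0$ assumed, gives the following entries:

```latex
\[
\begin{array}{c|c|c|c}
\lambda & v(x) & w(y) & u(x,y) = v(x)\,w(y) \\ \hline
\lambda = -\omega^2 < 0 & \cos\omega x,\ \sin\omega x & e^{-\omega y},\ e^{\omega y}
  & e^{\omega y}\cos\omega x,\ e^{\omega y}\sin\omega x,\
    e^{-\omega y}\cos\omega x,\ e^{-\omega y}\sin\omega x \\[4pt]
\lambda = 0 & 1,\ x & 1,\ y & 1,\ x,\ y,\ xy \\[4pt]
\lambda = \omega^2 > 0 & e^{-\omega x},\ e^{\omega x} & \cos\omega y,\ \sin\omega y
  & e^{\omega x}\cos\omega y,\ e^{\omega x}\sin\omega y,\
    e^{-\omega x}\cos\omega y,\ e^{-\omega x}\sin\omega y
\end{array}
\]
```

Each entry in the last column can be checked directly: for instance, $u = e^{\omega y}\sin\omega x$ has $u_{xx} = -\omega^2 u$ and $u_{yy} = \omega^2 u$, so $u_{xx} + u_{yy} = 0$.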


Since Laplace’s equation is a homogeneous linear system, any linear combination of
solutions is also a solution. So, we can build more general solutions as finite linear combinations,
or, provided we pay proper attention to convergence issues, infinite series in the
separable solutions. Our goal is to solve boundary value problems, and so we must ensure
that the resulting combination satisfies the boundary conditions. But this is not such an
easy task, unless the underlying domain has a rather special geometry.

In  fact,  the  only  bounded  domains  on  which  we  can  explicitly  solve  boundary  value 
problems  using  the  preceding  separable  solutions  are  rectangles.  So,  we  will  concentrate 
on  boundary  value  problems  for  Laplace’s  equation 

$$\Delta u = 0 \quad \text{on a rectangle} \quad R = \{\,0 < x < a,\ 0 < y < b\,\}. \tag{4.91}$$

To make progress, we will allow nonzero boundary values on only one of the four sides of
the rectangle. To illustrate, we will focus on the following Dirichlet boundary conditions:

$$u(x,0) = f(x), \qquad u(x,b) = 0, \qquad u(0,y) = 0, \qquad u(a,y) = 0. \tag{4.92}$$

Once  we  know  how  to  solve  this  type  of  problem,  we  can  employ  linear  superposition  to 
solve  the  general  Dirichlet  boundary  value  problem  on  a  rectangle;  see  Exercise  4.3.12  for 
details.  Other  boundary  conditions  can  be  treated  in  a  similar  fashion  —  with  the  proviso 
that  the  condition  on  each  side  of  the  rectangle  is  either  entirely  Dirichlet  or  entirely 
Neumann  or,  more  generally,  entirely  Robin  with  constant  transfer  coefficient. 

To solve the boundary value problem (4.91-92), the first step is to narrow down the
separable solutions to only those that respect the three homogeneous boundary conditions.
The separable function $u(x,y) = v(x)\,w(y)$ will vanish on the top, right, and left sides of
the rectangle, provided

$$v(0) = v(a) = 0 \quad \text{and} \quad w(b) = 0.$$

Referring to the preceding table, the first condition $v(0) = 0$ requires

$$v(x) = \begin{cases} \sin\omega x, & \lambda = -\omega^2 < 0, \\ x, & \lambda = 0, \\ \sinh\omega x, & \lambda = \omega^2 > 0, \end{cases}$$

where $\sinh z = \tfrac12\,(e^z - e^{-z})$ is the usual hyperbolic sine function. However, the second
and third cases cannot satisfy the second boundary condition $v(a) = 0$, and so we discard
them. The first case leads to the condition

$$v(a) = \sin\omega a = 0, \quad \text{and hence} \quad \omega a = \pi,\ 2\pi,\ 3\pi,\ \ldots\,.$$

The corresponding separation constants and solutions (up to constant multiple) are

$$\lambda_n = -\omega^2 = -\,\frac{n^2\pi^2}{a^2}, \qquad v_n(x) = \sin\frac{n\pi x}{a}, \qquad n = 1, 2, 3, \ldots\,. \tag{4.93}$$


Note: So far, we have merely recomputed the known eigenvalues and eigenfunctions
of the familiar boundary value problem $v'' - \lambda v = 0$, $v(0) = v(a) = 0$.

Next, since $\lambda = -\omega^2 < 0$, we have $w(y) = c_1 e^{\omega y} + c_2 e^{-\omega y}$ for constants $c_1, c_2$. The
third boundary condition $w(b) = 0$ then requires that, up to constant multiple,

$$w_n(y) = \sinh\omega\,(b - y) = \sinh\frac{n\pi\,(b - y)}{a}. \tag{4.94}$$

We conclude that the harmonic functions

$$u_n(x,y) = \sin\frac{n\pi x}{a}\,\sinh\frac{n\pi\,(b - y)}{a}, \qquad n = 1, 2, 3, \ldots, \tag{4.95}$$

provide a complete list of separable solutions that satisfy the three homogeneous boundary
conditions. It remains to analyze the inhomogeneous boundary condition along the bottom
edge of the rectangle. To this end, let us try a linear superposition of the relevant separable
solutions in the form of an infinite series

$$u(x,y) = \sum_{n=1}^{\infty} c_n\,u_n(x,y) = \sum_{n=1}^{\infty} c_n \sin\frac{n\pi x}{a}\,\sinh\frac{n\pi\,(b - y)}{a},$$

whose coefficients $c_1, c_2, \ldots$ are to be prescribed by the remaining boundary condition. At
the bottom edge, $y = 0$, we find

$$u(x,0) = \sum_{n=1}^{\infty} c_n \sinh\frac{n\pi b}{a}\,\sin\frac{n\pi x}{a} = f(x), \qquad 0 \le x \le a, \tag{4.96}$$

which takes the form of a Fourier sine series for the function $f(x)$. Let

$$b_n = \frac{2}{a} \int_0^a f(x)\,\sin\frac{n\pi x}{a}\,dx \tag{4.97}$$

be its Fourier sine coefficients, whence $c_n = b_n / \sinh(n\pi b/a)$. We thus anticipate that the
solution to the boundary value problem can be expressed as the infinite series

$$u(x,y) = \sum_{n=1}^{\infty} \frac{b_n \,\sin\dfrac{n\pi x}{a}\,\sinh\dfrac{n\pi\,(b - y)}{a}}{\sinh\dfrac{n\pi b}{a}}. \tag{4.98}$$

Figure 4.9. Square membrane on a wire.


Does this series actually converge to the solution to the boundary value problem?
Fourier analysis says that, under very mild conditions on the boundary function $f(x)$, the
answer is yes. Suppose that its Fourier coefficients are uniformly bounded,

$$|\,b_n\,| \le M \quad \text{for all} \quad n \ge 1, \tag{4.99}$$

which, according to (4.27), is true whenever $f(x)$ is piecewise continuous or, more generally,
integrable: $\int_0^a |f(x)|\,dx < \infty$. In this case, as you are asked to prove in Exercise 4.3.20,
the coefficients of the Fourier sine series (4.98) go to zero exponentially fast:

$$\frac{b_n \,\sinh\dfrac{n\pi\,(b - y)}{a}}{\sinh\dfrac{n\pi b}{a}} \longrightarrow 0 \quad \text{as} \quad n \to \infty \quad \text{for all} \quad 0 < y \le b, \tag{4.100}$$

and so, at each point inside the rectangle, the series can be well approximated by partial
summation. Theorem 3.31 tells us that, for each $0 < y \le b$, the solution $u(x,y)$ is an
infinitely differentiable function of $x$. Moreover, by term-wise differentiation of the series
with respect to $y$ and use of Proposition 3.28, we also establish that the solution is infinitely
differentiable with respect to $y$; see Exercise 4.3.21. (In fact, as we shall see, solutions to
the Laplace equation are always analytic functions inside their domain of definition, even
when their boundary values are rather rough.) Since the individual terms all satisfy the
Laplace equation, we conclude that the series (4.98) is indeed a classical solution to the
boundary value problem.
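The series formula (4.97-98) translates almost directly into a short numerical routine. The following is a minimal sketch, not a definitive implementation: the function names, the midpoint quadrature rule, and the sample count are illustrative assumptions. The ratio of hyperbolic sines is rewritten with decaying exponentials so that large $n$ does not cause overflow.

```python
import math

def sine_coefficients(f, a, n_max, samples=2000):
    """Approximate b_n = (2/a) * integral_0^a f(x) sin(n pi x / a) dx for n = 1..n_max.
    The composite midpoint rule is an illustrative choice of quadrature."""
    h = a / samples
    xs = [(i + 0.5) * h for i in range(samples)]
    fs = [f(x) for x in xs]
    coeffs = []
    for n in range(1, n_max + 1):
        total = sum(fx * math.sin(n * math.pi * x / a) for x, fx in zip(xs, fs))
        coeffs.append(2.0 / a * total * h)
    return coeffs

def sinh_ratio(alpha, b, y):
    """sinh(alpha (b - y)) / sinh(alpha b), written with decaying exponentials."""
    numerator = 1.0 - math.exp(-2.0 * alpha * (b - y))
    denominator = 1.0 - math.exp(-2.0 * alpha * b)
    return math.exp(-alpha * y) * numerator / denominator

def rectangle_solution(coeffs, a, b, x, y):
    """Partial sum of the series (4.98) using the given sine coefficients b_1, b_2, ..."""
    total = 0.0
    for n, bn in enumerate(coeffs, start=1):
        alpha = n * math.pi / a
        total += bn * math.sin(alpha * x) * sinh_ratio(alpha, b, y)
    return total
```

For boundary data consisting of a single mode, $f(x) = \sin(\pi x/a)$, the routine should reproduce the separable solution $u_1(x,y)/\sinh(\pi b/a)$ from (4.95), which makes a convenient sanity check.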


Example 4.4. A membrane is stretched over a wire in the shape of a unit square
with one side bent in half, as graphed in Figure 4.9. The precise boundary conditions are

$$u(x,0) = f(x) = \begin{cases} x, & 0 \le x \le \tfrac12, \\ 1 - x, & \tfrac12 \le x \le 1, \end{cases} \qquad u(x,1) = 0, \qquad u(0,y) = 0, \qquad u(1,y) = 0.$$

The Fourier sine series of the inhomogeneous boundary function is readily computed:

$$f(x) = \frac{4}{\pi^2} \left( \sin\pi x - \frac{\sin 3\pi x}{9} + \frac{\sin 5\pi x}{25} - \cdots \right) = \frac{4}{\pi^2} \sum_{j=0}^{\infty} (-1)^j\,\frac{\sin(2j+1)\pi x}{(2j+1)^2}.$$

Specializing (4.98) to $a = b = 1$, we conclude that the solution to the boundary value
problem can be expressed as a Fourier series

$$u(x,y) = \frac{4}{\pi^2} \sum_{j=0}^{\infty} (-1)^j\,\frac{\sin(2j+1)\pi x\ \sinh(2j+1)\pi(1 - y)}{(2j+1)^2\,\sinh(2j+1)\pi}.$$

In Figure 4.9 we plot the sum of the first 10 terms in the series. This gives a reasonably good
approximation to the actual solution, except when we are very close to the raised corner
of the boundary wire, which is the point of maximal displacement of the membrane.
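The remark about the quality of the 10-term approximation can be explored numerically. The sketch below evaluates partial sums of the series solution of Example 4.4, assuming the coefficients $4(-1)^j/\bigl(\pi^2(2j+1)^2\bigr)$ of the bent-wire boundary function; the function name and default term count are illustrative choices.

```python
import math

def membrane_partial_sum(x, y, terms=10):
    """Sum of the first `terms` terms of the series solution on the unit square."""
    total = 0.0
    for j in range(terms):
        m = 2 * j + 1
        # sinh(m pi (1 - y)) / sinh(m pi), written with decaying exponentials
        ratio = (math.exp(-m * math.pi * y)
                 * (1.0 - math.exp(-2.0 * m * math.pi * (1.0 - y)))
                 / (1.0 - math.exp(-2.0 * m * math.pi)))
        total += (-1) ** j * math.sin(m * math.pi * x) * ratio / m**2
    return 4.0 / math.pi**2 * total

# At the raised corner (x, y) = (1/2, 0) the exact boundary value is 1/2;
# the 10-term sum falls short by roughly 0.01, while 1000 terms are much closer.
corner_10 = membrane_partial_sum(0.5, 0.0, terms=10)
corner_1000 = membrane_partial_sum(0.5, 0.0, terms=1000)
```

Away from the bottom edge, the exponential decay of the hyperbolic sine ratio makes the first few terms dominate, which is why the partial sum is accurate everywhere except near the corner of the wire.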


Exercises 

4.3.10. Solve the following boundary value problems for Laplace’s equation on the square
$S = \{\,0 < x < \pi,\ 0 < y < \pi\,\}$.

(a) $u(x,0) = \sin^3 x$, $u(x,\pi) = 0$, $u(0,y) = 0$, $u(\pi,y) = 0$.

(b) $u(x,0) = 0$, $u(x,\pi) = 0$, $u(0,y) = \sin y$, $u(\pi,y) = 0$.

(c) $u(x,0) = 0$, $u(x,\pi) = 1$, $u(0,y) = 0$, $u(\pi,y) = 0$.

(d) $u(x,0) = 0$, $u(x,\pi) = 0$, $u(0,y) = 0$, $u(\pi,y) = y\,(\pi - y)$.

♦ 4.3.11. (a) Explain how to use linear superposition to solve the boundary value problem

$$\Delta u = 0, \qquad u(x,0) = f(x), \quad u(x,b) = g(x), \quad u(0,y) = h(y), \quad u(a,y) = k(y),$$

on the rectangle $R = \{\,0 < x < a,\ 0 < y < b\,\}$, by splitting it into four separate boundary
value problems for which each of the solutions vanishes on three sides of the rectangle.

(b) Write down a series formula for the resulting solution.

4.3.12. Solve the following Dirichlet problems for Laplace’s equation on the unit square
$S = \{\,0 < x, y < 1\,\}$. Hint: Use superposition as in Exercise 4.3.11.

(a) $u(x,0) = \sin\pi x$, $u(x,1) = 0$, $u(0,y) = \sin\pi y$, $u(1,y) = 0$;

(b) $u(x,0) = 1$, $u(x,1) = 0$, $u(0,y) = 1$, $u(1,y) = 0$;

(c) $u(x,0) = 1$, $u(x,1) = 1$, $u(0,y) = 0$, $u(1,y) = 0$;

(d) $u(x,0) = x$, $u(x,1) = 1 - x$, $u(0,y) = y$, $u(1,y) = 1 - y$.

4.3.13. Solve the following mixed boundary value problems for Laplace’s equation $\Delta u = 0$ on
the square $S = \{\,0 < x, y < \pi\,\}$.

(a)  77(x,  0)  =  sin  ^  x,  77^  (x,  77)  =  0,  77(0,  y)  =  0,  77^(77,  y)  =  0; 

"I 

(b)  77(x,  0)  =  sin  ^  x,  77^(x,  77)  =  0,  77x(0,t/)  =  0,  77^(77,  y)  =  0; 

(c) $u(x,0) = x$, $u(x,\pi) = 0$, $u_x(0,y) = 0$, $u_x(\pi,y) = 0$;

(d) $u(x,0) = x$, $u(x,\pi) = 0$, $u(0,y) = 0$, $u_x(\pi,y) = 0$.

4.3.14. Find the solution to the boundary value problem

$$\Delta u = 0, \qquad \begin{aligned} &u_y(x,0) = u_y(x,2) = 0, && 0 < x < 1, \\ &u(0,y) = 2\cos\pi y - 1, \quad u(1,y) = 0, && 0 < y < 2. \end{aligned}$$
4.3.15. Find the solution to the boundary value problem

$\Delta u = 0$,


u(x,  0)  =  2  cos  7 7 xx  —  4,  u(x,  1)  = 

Ux(0’V)  =  ux(liV)  =  °> 


5  COS  37TX, 


0  <  x,  y  <  1. 


4.3.16. Let $u(x,y)$ be the solution to the boundary value problem

$\Delta u = 0$, $u(x,-1) = f(x)$, $u(x,1) = 0$, $u(-1,y) = 0$, $u(1,y) = 0$, $-1 < x < 1$, $-1 < y < 1$.

(a) True or false: If $f(-x) = -f(x)$ is odd, then $u(0,y) = 0$ for all $-1 \le y \le 1$.

(b) True or false: If $f(0) = 0$, then $u(0,y) = 0$ for all $-1 \le y \le 1$.

(c) Under what conditions on $f(x)$ is $u(x,0) = 0$ for all $-1 \le x \le 1$?

4.3.17. Use separation of variables to solve the following boundary value problem:

$u_{xx} + 2\,u_y + u_{yy} = 0$, $u(x,0) = 0$, $u(x,1) = f(x)$, $u(0,y) = 0$, $u(1,y) = 0$.

4.3.18. Use separation of variables to solve the Helmholtz boundary value problem $\Delta u = u$,
$u(x,0) = 0$, $u(x,1) = f(x)$, $u(0,y) = 0$, $u(1,y) = 0$, on the unit square $0 < x, y < 1$.

♦ 4.3.19. Provide the details for the derivation of (4.94).

♦ 4.3.20. Justify the statement that if $|\,b_n\,| \le M$ are uniformly bounded, then the coefficients
given in (4.100) go to zero exponentially fast as $n \to \infty$ for any $0 < y \le b$.

4.3.21. Let $u(x,y)$ denote the solution to the boundary value problem (4.91-92).
(a) Write down the Fourier sine series for $\partial u/\partial y$. (b) Prove that $\partial u/\partial y$ is an infinitely
differentiable function of $x$. (c) Justify the same result for the functions $\partial^k u/\partial y^k$ for each
$k \ge 0$. Hint: Don’t forget that $u(x,y)$ solves the Laplace equation.


Polar  Coordinates 

The  method  of  separation  of  variables  can  be  successfully  exploited  in  certain  other  very 
special  geometries.  One  particularly  important  case  is  a  circular  disk.  To  be  specific,  let 
us  take  the  disk  to  have  radius  1  and  be  centered  at  the  origin.  Consider  the  Dirichlet 
boundary  value  problem 

$$\Delta u = 0, \quad x^2 + y^2 < 1, \qquad \text{and} \qquad u = h, \quad x^2 + y^2 = 1, \tag{4.101}$$

so that the function $u(x,y)$ satisfies the Laplace equation on the unit disk and the specified
Dirichlet boundary conditions on the unit circle. For example, $u(x,y)$ might represent the
displacement of a circular drum that is attached to a wire of height

$$h(x,y) = h(\cos\theta, \sin\theta) \equiv h(\theta), \qquad -\pi < \theta \le \pi, \tag{4.102}$$

at each point $(x,y) = (\cos\theta, \sin\theta)$ on its edge.

The rectangular separable solutions are not particularly helpful in this situation, and
so we look for solutions that are better adapted to a circular geometry. This inspires us to
adopt polar coordinates

$$x = r\cos\theta, \quad y = r\sin\theta, \qquad \text{or} \qquad r = \sqrt{x^2 + y^2}, \quad \theta = \tan^{-1}\frac{y}{x}, \tag{4.103}$$

and write the solution $u(r,\theta)$ as a function thereof.

Warning: We will often retain the same symbol, e.g., $u$, when rewriting a function
in a different coordinate system. This is the convention of tensor analysis, physics, and
differential geometry, [3], that treats the function (scalar field) as an intrinsic object, which
is concretely realized through its formula in any chosen coordinate system. For instance,
if $u(x,y) = x^2 + 2y$ in rectangular coordinates, then its expression in polar coordinates
is $u(r,\theta) = (r\cos\theta)^2 + 2\,r\sin\theta$, not $r^2 + 2\,\theta$. This convention avoids the inconvenience of
having to devise new symbols when changing coordinates.


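We need to relate derivatives with respect to $x$ and $y$ to those with respect to $r$ and
$\theta$. Performing a standard multivariate chain rule computation based on (4.103), we obtain

$$\frac{\partial}{\partial r} = \cos\theta\,\frac{\partial}{\partial x} + \sin\theta\,\frac{\partial}{\partial y}, \qquad \frac{\partial}{\partial\theta} = -\,r\sin\theta\,\frac{\partial}{\partial x} + r\cos\theta\,\frac{\partial}{\partial y},$$

so

$$\frac{\partial}{\partial x} = \cos\theta\,\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\,\frac{\partial}{\partial\theta}, \qquad \frac{\partial}{\partial y} = \sin\theta\,\frac{\partial}{\partial r} + \frac{\cos\theta}{r}\,\frac{\partial}{\partial\theta}. \tag{4.104}$$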

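Applying the squares of the latter differential operators to $u(r,\theta)$, we find, after a calculation
in which many of the terms cancel, the polar coordinate form of the Laplace equation:

$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\,\frac{\partial u}{\partial r} + \frac{1}{r^2}\,\frac{\partial^2 u}{\partial\theta^2} = 0. \tag{4.105}$$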
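The cancellation behind the polar form of the Laplacian can be spot-checked numerically. The sketch below is an illustration only: the helper names and the finite-difference step are assumptions, and central differences give approximate rather than exact values. It evaluates the polar expression $u_{rr} + u_r/r + u_{\theta\theta}/r^2$ for a function supplied in rectangular coordinates.

```python
import math

def to_polar(u_cartesian):
    """Re-express a function of (x, y) as a function of (r, theta)."""
    return lambda r, theta: u_cartesian(r * math.cos(theta), r * math.sin(theta))

def polar_laplacian(u_polar, r, theta, h=1e-3):
    """Central-difference approximation of u_rr + u_r / r + u_theta_theta / r^2."""
    u_rr = (u_polar(r + h, theta) - 2.0 * u_polar(r, theta) + u_polar(r - h, theta)) / h**2
    u_r = (u_polar(r + h, theta) - u_polar(r - h, theta)) / (2.0 * h)
    u_tt = (u_polar(r, theta + h) - 2.0 * u_polar(r, theta) + u_polar(r, theta - h)) / h**2
    return u_rr + u_r / r + u_tt / r**2

# u = x^3 y has rectangular Laplacian 6 x y; u = x^2 - y^2 is harmonic.
cubic = to_polar(lambda x, y: x**3 * y)
saddle = to_polar(lambda x, y: x * x - y * y)
```

Evaluating `polar_laplacian(cubic, r, theta)` at a point away from the origin should agree with $6xy$ at the corresponding rectangular point up to discretization error, while `polar_laplacian(saddle, r, theta)` should be close to zero.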


The boundary conditions are imposed on the unit circle $r = 1$, and so, by (4.102), take the
form

$$u(1,\theta) = h(\theta). \tag{4.106}$$

Keep in mind that, in order to be single-valued functions of $x, y$, the solution $u(r,\theta)$ and
its boundary values $h(\theta)$ must both be $2\pi$-periodic functions of the angular coordinate:

$$u(r, \theta + 2\pi) = u(r,\theta), \qquad h(\theta + 2\pi) = h(\theta). \tag{4.107}$$

Polar separation of variables is based on the ansatz

$$u(r,\theta) = v(r)\,w(\theta), \tag{4.108}$$

which assumes that the solution is a product of functions of the individual variables. Substituting
(4.108) into the polar form (4.105) of Laplace’s equation yields

$$v''(r)\,w(\theta) + \frac{1}{r}\,v'(r)\,w(\theta) + \frac{1}{r^2}\,v(r)\,w''(\theta) = 0.$$

We now separate variables by moving all the terms involving $r$ onto one side of the equation
and all the terms involving $\theta$ onto the other. This is accomplished by first multiplying the
equation by $r^2/\bigl(v(r)\,w(\theta)\bigr)$ and then moving the final term to the right-hand side:

$$\frac{r^2\,v''(r) + r\,v'(r)}{v(r)} = -\,\frac{w''(\theta)}{w(\theta)} = \lambda.$$

As in the rectangular case, a function of $r$ can equal a function of $\theta$ if and only if both are
equal to a common separation constant, which we call $\lambda$. The partial differential equation
thus splits into a pair of ordinary differential equations

$$r^2\,v'' + r\,v' - \lambda\,v = 0, \qquad w'' + \lambda\,w = 0, \tag{4.109}$$

that will prescribe the separable solution (4.108). Observe that both have the form of an
eigenfunction equation in which the separation constant $\lambda$ plays the role of the eigenvalue.
We are, as always, interested only in nonzero solutions.
We have already solved the eigenvalue problem for $w(\theta)$. According to (4.107),
$w(\theta + 2\pi) = w(\theta)$ must be a $2\pi$-periodic function. Therefore, by our earlier discussion,
this periodic boundary value problem has the nonzero eigenfunctions

$$1, \quad \sin n\theta, \quad \cos n\theta, \qquad n = 1, 2, \ldots, \tag{4.110}$$

corresponding to the eigenvalues (separation constants)

$$\lambda = n^2, \qquad n = 0, 1, 2, \ldots\,.$$

With the value of $\lambda$ fixed, the linear ordinary differential equation for the radial component,

$$r^2\,v'' + r\,v' - n^2\,v = 0, \tag{4.111}$$

does not have constant coefficients. But, fortunately, it has the form of a second-order Euler
ordinary differential equation, [23, 89], and hence can be readily solved by substituting the
power ansatz $v(r) = r^k$. (See also Exercise 4.3.23.) Note that

$$v'(r) = k\,r^{k-1}, \qquad v''(r) = k\,(k-1)\,r^{k-2},$$

and hence, by substituting into the differential equation,

$$r^2\,v'' + r\,v' - n^2\,v = \bigl[\,k\,(k-1) + k - n^2\,\bigr]\,r^k = (k^2 - n^2)\,r^k.$$

Thus, $r^k$ is a solution if and only if

$$k^2 - n^2 = 0, \quad \text{and hence} \quad k = \pm\,n.$$

For $n \ne 0$, we have found the two linearly independent solutions:

$$v_1(r) = r^n, \qquad v_2(r) = r^{-n}, \qquad n = 1, 2, \ldots\,. \tag{4.112}$$

When $n = 0$, the power ansatz yields only the constant solution. But in this case, the
equation $r^2\,v'' + r\,v' = 0$ is effectively of first order and linear in $v'$, and hence readily
integrated. This provides the two independent solutions

$$v_1(r) = 1, \qquad v_2(r) = \log r, \qquad n = 0. \tag{4.113}$$
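For completeness, here is one way to carry out the integration in the case $n = 0$. This is a sketch of the standard argument, with $c$ and $d$ denoting arbitrary integration constants introduced here for illustration:

```latex
\[
r^2 v'' + r\,v' = 0
\;\Longrightarrow\;
\frac{d}{dr}\bigl( r\,v' \bigr) = r\,v'' + v' = 0
\;\Longrightarrow\;
r\,v' = c
\;\Longrightarrow\;
v' = \frac{c}{r}
\;\Longrightarrow\;
v(r) = c \log r + d .
\]
```

Taking $(c,d) = (0,1)$ and $(c,d) = (1,0)$ recovers the two independent solutions $1$ and $\log r$.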

Combining (4.110) and (4.112-113), we produce the complete list of separable polar coordinate
solutions to the Laplace equation:

$$\begin{array}{lll} 1, & r^n \cos n\theta, & r^n \sin n\theta, \\ \log r, & r^{-n} \cos n\theta, & r^{-n} \sin n\theta, \end{array} \qquad n = 1, 2, 3, \ldots\,. \tag{4.114}$$

Now, the solutions in the top row of (4.114) are continuous (in fact analytic) at the origin,
where $r = 0$, whereas the solutions in the bottom row have singularities as $r \to 0$. The
latter are not of use in the present situation, since we require that the solution remain
bounded and smooth, even at the center of the disk. Thus, we should use only the
nonsingular solutions to concoct a candidate series solution

$$u(r,\theta) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \bigl(\, a_n\,r^n \cos n\theta + b_n\,r^n \sin n\theta \,\bigr). \tag{4.115}$$
The coefficients $a_n, b_n$ will be prescribed by the boundary conditions (4.106). Substituting
$r = 1$, we obtain

$$u(1,\theta) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \bigl(\, a_n \cos n\theta + b_n \sin n\theta \,\bigr) = h(\theta).$$

We recognize this as a standard Fourier series (3.29) (with $\theta$ replacing $x$) for the $2\pi$-periodic
function $h(\theta)$. Therefore,

$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} h(\theta)\,\cos n\theta\,d\theta, \qquad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} h(\theta)\,\sin n\theta\,d\theta, \tag{4.116}$$

are precisely its Fourier coefficients, cf. (3.35). In this manner, we have produced a series
solution (4.115) to the boundary value problem (4.105-106).
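The recipe (4.115-116) can be sketched in a few lines of code. The example below is illustrative only: the function names, the midpoint quadrature, and the sample count are assumptions rather than part of the text. It approximates the Fourier coefficients of the boundary data and then evaluates a partial sum of the series inside the disk.

```python
import math

def fourier_coefficients(h, n_max, samples=4096):
    """Approximate a_0..a_n_max and b_1..b_n_max of (4.116) by the midpoint rule on [-pi, pi].
    The list b carries a placeholder b[0] = 0 so that indices match n."""
    dt = 2.0 * math.pi / samples
    thetas = [-math.pi + (i + 0.5) * dt for i in range(samples)]
    hs = [h(t) for t in thetas]
    a = [sum(hv * math.cos(n * t) for t, hv in zip(thetas, hs)) * dt / math.pi
         for n in range(n_max + 1)]
    b = [0.0] + [sum(hv * math.sin(n * t) for t, hv in zip(thetas, hs)) * dt / math.pi
                 for n in range(1, n_max + 1)]
    return a, b

def disk_solution(a, b, r, theta):
    """Partial sum of the series (4.115) at the polar point (r, theta), 0 <= r <= 1."""
    total = a[0] / 2.0
    for n in range(1, len(a)):
        total += r**n * (a[n] * math.cos(n * theta) + b[n] * math.sin(n * theta))
    return total
```

For trigonometric boundary data such as $h(\theta) = 1 + \cos 2\theta - 3\sin\theta$, the series terminates and the routine should return $1 + r^2\cos 2\theta - 3\,r\sin\theta$ up to rounding error.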

Remark: Introducing the complex variable

$$z = x + i\,y = r\,e^{i\theta} = r\cos\theta + i\,r\sin\theta \tag{4.117}$$

allows us to write

$$z^n = r^n\,e^{i n\theta} = r^n \cos n\theta + i\,r^n \sin n\theta. \tag{4.118}$$

Therefore, the nonsingular separable solutions are the harmonic polynomials

$$r^n \cos n\theta = \operatorname{Re} z^n, \qquad r^n \sin n\theta = \operatorname{Im} z^n. \tag{4.119}$$

The first few are listed in the following table:

    n    Re z^n                       Im z^n
    0    $1$                          $0$
    1    $x$                          $y$
    2    $x^2 - y^2$                  $2xy$
    3    $x^3 - 3xy^2$                $3x^2y - y^3$
    4    $x^4 - 6x^2y^2 + y^4$        $4x^3y - 4xy^3$

Their general expression is obtained using the Binomial Formula:

$$\begin{aligned} z^n = (x + i\,y)^n &= x^n + n\,x^{n-1}(i\,y) + \binom{n}{2} x^{n-2}(i\,y)^2 + \binom{n}{3} x^{n-3}(i\,y)^3 + \cdots + (i\,y)^n \\ &= x^n + i\,n\,x^{n-1} y - \binom{n}{2} x^{n-2} y^2 - i \binom{n}{3} x^{n-3} y^3 + \cdots, \end{aligned}$$

where

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!} \tag{4.120}$$

are the usual binomial coefficients. Separating the real and imaginary terms, we produce
the explicit formulae

$$\begin{aligned} r^n \cos n\theta = \operatorname{Re} z^n &= x^n - \binom{n}{2} x^{n-2} y^2 + \binom{n}{4} x^{n-4} y^4 + \cdots, \\ r^n \sin n\theta = \operatorname{Im} z^n &= n\,x^{n-1} y - \binom{n}{3} x^{n-3} y^3 + \binom{n}{5} x^{n-5} y^5 + \cdots, \end{aligned} \tag{4.121}$$

for the two independent harmonic polynomials of degree $n$.

Figure 4.10. Membrane attached to a helical wire.
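The binomial expansions (4.121) are easy to evaluate and compare against a direct complex power. The following is a small sketch; the function name is an illustrative choice, and `math.comb` requires Python 3.8 or later.

```python
import math

def harmonic_polynomials(n, x, y):
    """Return (Re z^n, Im z^n) for z = x + i y using the binomial expansions (4.121)."""
    re = 0.0
    im = 0.0
    for k in range(n + 1):
        term = math.comb(n, k) * x ** (n - k) * y ** k
        # i^k cycles through 1, i, -1, -i as k increases
        if k % 4 == 0:
            re += term
        elif k % 4 == 1:
            im += term
        elif k % 4 == 2:
            re -= term
        else:
            im -= term
    return re, im

# Degree-4 check at (x, y) = (1.5, -0.5):
# Re z^4 = x^4 - 6 x^2 y^2 + y^4 = 1.75 and Im z^4 = 4 x^3 y - 4 x y^3 = -6.0
re4, im4 = harmonic_polynomials(4, 1.5, -0.5)
```

Comparing the output with `complex(x, y) ** n` for a few values of `n` gives an independent confirmation of the signs in (4.121).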


Example  4.5.  Consider  the  Dirichlet  boundary  value  problem  on  the  unit  disk  with 


$$u(1,\theta) = \theta \qquad \text{for} \quad -\pi < \theta < \pi. \tag{4.122}$$

The boundary data can be interpreted as a wire in the shape of a single turn of a spiral helix sitting over the unit circle. The wire has a single jump discontinuity, of magnitude $2\pi$, at the boundary point $(-1,0)$. The required Fourier series
$$h(\theta) = \theta \sim 2\left( \sin\theta - \frac{\sin 2\theta}{2} + \frac{\sin 3\theta}{3} - \frac{\sin 4\theta}{4} + \cdots \right)$$
was already computed in Example 3.3. Therefore, invoking our solution formula (4.115–116), we find that
$$u(r,\theta) = 2\left( r \sin\theta - \frac{r^2 \sin 2\theta}{2} + \frac{r^3 \sin 3\theta}{3} - \frac{r^4 \sin 4\theta}{4} + \cdots \right) \tag{4.123}$$


is  the  desired  solution,  which  is  plotted  in  Figure  4.10.  In  fact,  this  series  can  be  explicitly 
summed.  In  view  of  (4.119)  and  the  usual  formula  (A.  13)  for  the  complex  logarithm,  we 
have 


$$u = 2 \operatorname{Im}\left( z - \frac{z^2}{2} + \frac{z^3}{3} - \frac{z^4}{4} + \cdots \right) = 2 \operatorname{Im} \log(1 + z) = 2\psi, \tag{4.124}$$
where
$$\psi = \tan^{-1} \frac{y}{1 + x}$$
is the angle that the line passing through the two points $(x, y)$ and $(-1,0)$ makes with the x-axis, as sketched in Figure 4.11. You should try to convince yourself that, on the unit circle, $2\psi = \theta$ has the correct boundary values. Observe that, even though the boundary values are discontinuous, the solution is an analytic function inside the disk.

Figure 4.11. Geometric construction of the solution.
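The agreement between the series (4.123) and its closed form (4.124) can be checked numerically. The sketch below is an illustration added here, not the author's code; the function names `u_series` and `u_closed` and the truncation at 200 terms are assumptions.

```python
import math

def u_series(r, theta, terms=200):
    """Partial sum of the series (4.123)."""
    return 2 * sum((-1) ** (n + 1) * r ** n * math.sin(n * theta) / n
                   for n in range(1, terms + 1))

def u_closed(r, theta):
    """Closed form (4.124): twice the angle psi = arctan(y / (1 + x))."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    # Inside the disk 1 + x > 0, so atan2 agrees with the principal arctangent.
    return 2 * math.atan2(y, 1 + x)

print(abs(u_series(0.5, 1.0) - u_closed(0.5, 1.0)))  # tiny: the two forms agree
print(u_closed(1.0, 2.0))                            # approximately 2.0 = theta on the unit circle
```

The second print illustrates the boundary behavior $2\psi = \theta$ for $-\pi < \theta < \pi$.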

Unlike the rectangular series (4.98), the general polar series solution formula (4.115) can, in fact, be summed in closed form! If we substitute the explicit Fourier formulae (4.116) into (4.115), remembering to change the integration variable to, say, $\phi$ to avoid a notational conflict, we obtain


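$$\begin{aligned}
u(r,\theta) &= \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n r^n \cos n\theta + b_n r^n \sin n\theta \right) \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} h(\phi)\, d\phi + \sum_{n=1}^{\infty} \left[ \frac{r^n \cos n\theta}{\pi} \int_{-\pi}^{\pi} h(\phi) \cos n\phi \, d\phi + \frac{r^n \sin n\theta}{\pi} \int_{-\pi}^{\pi} h(\phi) \sin n\phi \, d\phi \right] \\
&= \frac{1}{\pi} \int_{-\pi}^{\pi} h(\phi) \left[ \frac{1}{2} + \sum_{n=1}^{\infty} r^n \left( \cos n\theta \cos n\phi + \sin n\theta \sin n\phi \right) \right] d\phi \\
&= \frac{1}{\pi} \int_{-\pi}^{\pi} h(\phi) \left[ \frac{1}{2} + \sum_{n=1}^{\infty} r^n \cos n(\theta - \phi) \right] d\phi.
\end{aligned} \tag{4.125}$$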


We  next  show  how  to  sum  the  final  series.  Using  (4.118),  we  can  write  it  as  the  real  part 
of  a  geometric  series: 


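$$\begin{aligned}
\frac{1}{2} + \sum_{n=1}^{\infty} r^n \cos n\theta &= \operatorname{Re}\left( \frac{1}{2} + \sum_{n=1}^{\infty} z^n \right) = \operatorname{Re}\left( \frac{1}{2} + \frac{z}{1 - z} \right) = \operatorname{Re}\, \frac{1 + z}{2(1 - z)} \\
&= \operatorname{Re}\, \frac{(1 + z)(1 - \bar z)}{2\,|1 - z|^2} = \frac{\operatorname{Re}\left( 1 + z - \bar z - |z|^2 \right)}{2\,|1 - z|^2} = \frac{1 - |z|^2}{2\,|1 - z|^2} = \frac{1 - r^2}{2\,(1 + r^2 - 2r\cos\theta)},
\end{aligned}$$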

which is known as the Poisson kernel. Substituting back into (4.125) establishes the important Poisson Integral Formula for the solution to the boundary value problem.

Figure 4.12. Equilibrium temperature of a disk.


Theorem 4.6. The solution to the Laplace equation in the unit disk subject to Dirichlet boundary conditions $u(1,\theta) = h(\theta)$ is
$$u(r,\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} h(\phi)\, \frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \phi)} \, d\phi. \tag{4.126}$$


Example  4.7.  A  uniform  metal  disk  of  unit  radius  has  half  of  its  circular  boundary 
held  at  1°,  while  the  other  half  is  held  at  0°.  Our  task  is  to  find  the  equilibrium  temperature 
u(x,y).  In  other  words,  we  seek  the  solution  to  the  Dirichlet  boundary  value  problem 


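$$\Delta u = 0, \quad x^2 + y^2 < 1, \qquad u(x,y) = \begin{cases} 1, & x^2 + y^2 = 1, \ y > 0, \\ 0, & x^2 + y^2 = 1, \ y < 0. \end{cases} \tag{4.127}$$

In polar coordinates, the boundary data is a (periodic) step function
$$h(\theta) = \begin{cases} 1, & 0 < \theta < \pi, \\ 0, & -\pi < \theta < 0. \end{cases}$$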


Therefore, according to the Poisson formula (4.126), the solution is given by
$$u(r,\theta) = \frac{1}{2\pi} \int_{0}^{\pi} \frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \phi)} \, d\phi = \begin{cases} 1 - \dfrac{1}{\pi} \tan^{-1}\left( \dfrac{1 - r^2}{2r\sin\theta} \right), & 0 < \theta < \pi, \\[1ex] \dfrac{1}{2}, & \theta = 0, \pm\pi, \\[1ex] -\dfrac{1}{\pi} \tan^{-1}\left( \dfrac{1 - r^2}{2r\sin\theta} \right), & -\pi < \theta < 0, \end{cases} \tag{4.128}$$
where we use the principal branch $-\frac{1}{2}\pi < \tan^{-1} t < \frac{1}{2}\pi$ of the inverse tangent. (The detailed derivation of the final expressions is left to the reader as Exercise 4.3.40.) Reverting to rectangular coordinates, we find that the equilibrium temperature has the explicit formula

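$$u(x,y) = \begin{cases} 1 - \dfrac{1}{\pi} \tan^{-1}\left( \dfrac{1 - x^2 - y^2}{2y} \right), & x^2 + y^2 < 1, \ y > 0, \\[1ex] \dfrac{1}{2}, & x^2 + y^2 < 1, \ y = 0, \\[1ex] -\dfrac{1}{\pi} \tan^{-1}\left( \dfrac{1 - x^2 - y^2}{2y} \right), & x^2 + y^2 < 1, \ y < 0. \end{cases} \tag{4.129}$$

The result is depicted in Figure 4.12.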
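The Poisson kernel and the closed-form temperature (4.129) can both be checked numerically. The following Python sketch is an illustration added here under stated assumptions: the function names are hypothetical, the kernel series is truncated at 400 terms, and the integral (4.126) is approximated by a simple midpoint rule with 4000 cells (an even number, so that the jumps of the step function at 0 and at the ends of the interval fall on cell boundaries).

```python
import math

def poisson_kernel(r, angle):
    """Closed form (1 - r^2) / (2 (1 + r^2 - 2 r cos(angle)))."""
    return (1 - r * r) / (2 * (1 + r * r - 2 * r * math.cos(angle)))

def kernel_series(r, angle, terms=400):
    """Truncated series 1/2 + sum r^n cos(n * angle)."""
    return 0.5 + sum(r ** n * math.cos(n * angle) for n in range(1, terms + 1))

def u_poisson(r, theta, h, m=4000):
    """Midpoint-rule approximation of the Poisson integral (4.126)."""
    dphi = 2 * math.pi / m
    total = 0.0
    for j in range(m):
        phi = -math.pi + (j + 0.5) * dphi
        total += h(phi) * poisson_kernel(r, theta - phi)
    return total * dphi / math.pi

def step(phi):
    """Boundary data of Example 4.7: 1 on the upper semicircle, 0 below."""
    return 1.0 if 0 < phi < math.pi else 0.0

def u_exact(x, y):
    """Closed-form temperature (4.129)."""
    if y == 0:
        return 0.5
    t = math.atan((1 - x * x - y * y) / (2 * y)) / math.pi
    return 1 - t if y > 0 else -t

print(kernel_series(0.6, 0.9), poisson_kernel(0.6, 0.9))    # the two values agree
print(u_poisson(0.5, math.pi / 2, step), u_exact(0.0, 0.5))  # both close to 0.795
```

At the center of the disk both formulas give 1/2, the average of the boundary temperatures.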


Averaging, the Maximum Principle, and Analyticity


Let  us  investigate  some  important  consequences  of  the  Poisson  integral  formula  (4.126). 
First,  setting  r  =  0  yields 

$$u(0,\theta) = \frac{1}{2\pi} \int_{-\pi}^{\pi} h(\phi)\, d\phi. \tag{4.130}$$

The left-hand side is the value of u at the origin (the center of the disk) and so is independent of $\theta$; the right-hand side is the average of its boundary values around the unit circle. This formula is a particular instance of an important general fact.

Theorem 4.8. Let $u(x,y)$ be harmonic inside a disk of radius $a$ centered at a point $(x_0, y_0)$ with piecewise continuous (or, more generally, integrable) boundary values on the circle $C = \{ (x - x_0)^2 + (y - y_0)^2 = a^2 \}$. Then its value at the center of the disk is equal to the average of its values on the boundary circle:
$$u(x_0, y_0) = \frac{1}{2\pi a} \oint_C u \, ds = \frac{1}{2\pi} \int_{-\pi}^{\pi} u(x_0 + a\cos\theta,\ y_0 + a\sin\theta) \, d\theta. \tag{4.131}$$

Proof: One approach is to use the scaling and translation symmetries of the Laplace equation, cf. Exercises 4.3.8–9, to map the disk of radius $a$ centered at $(x_0, y_0)$ to the unit disk centered at the origin, and then invoke (4.130). Specifically, we set
$$U(x,y) = u(x_0 + a x,\ y_0 + a y). \tag{4.132}$$
An easy chain rule computation proves that $U(x,y)$ also satisfies the Laplace equation on the unit disk $x^2 + y^2 < 1$, with boundary values
$$h(\theta) = U(\cos\theta, \sin\theta) = u(x_0 + a\cos\theta,\ y_0 + a\sin\theta).$$
Therefore, by (4.130),
$$U(0,0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} h(\theta)\, d\theta = \frac{1}{2\pi} \int_{-\pi}^{\pi} U(\cos\theta, \sin\theta)\, d\theta.$$


Replacing U by its formula (4.132) produces the desired result for solutions defined by the Poisson integral formula. However, it is not a priori clear that all solutions, i.e., all harmonic functions, are necessarily of this form. This will follow eventually from the Uniqueness Theorem 4.10; however, its proof relies on formula (4.131), leading to a circular argument.

A better proof, which does not rely on the solution formula (4.115–116), is the following. Given the harmonic function $u(x,y)$, consider the scalar function
$$g(a) = \frac{1}{2\pi} \int_{-\pi}^{\pi} u(x_0 + a\cos\theta,\ y_0 + a\sin\theta) \, d\theta,$$
which is well defined for $a > 0$ sufficiently small. Since $u \in C^2$, we can calculate the derivative of g as follows:
$$\begin{aligned}
g'(a) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} \left[ \cos\theta\, \frac{\partial u}{\partial x}(x_0 + a\cos\theta,\ y_0 + a\sin\theta) + \sin\theta\, \frac{\partial u}{\partial y}(x_0 + a\cos\theta,\ y_0 + a\sin\theta) \right] d\theta \\
&= \frac{1}{2\pi a} \oint_C \frac{\partial u}{\partial n} \, ds,
\end{aligned}$$


where $n = (\cos\theta, \sin\theta)$ defines the unit normal to C at the point $(x_0 + a\cos\theta,\ y_0 + a\sin\theta)$ and $ds = a\, d\theta$ is the arc length element. Letting $D = \{ (x - x_0)^2 + (y - y_0)^2 < a^2 \}$ denote the disk of radius $a$, so $C = \partial D$ is its boundary, the divergence identity (6.89), which is an easy consequence of Green's Theorem, implies that the latter integral equals
$$\oint_C \frac{\partial u}{\partial n} \, ds = \iint_D \Delta u \, dx\, dy = 0$$
because u is harmonic. Thus, $g'(a) = 0$ for all $a > 0$ sufficiently small, which implies $g(a) = c$ is constant. But $g(a)$ represents the average of $u(x,y)$ on the circle C of radius $a$ and hence, as $a \to 0$, the average $g(a) \to u(x_0, y_0)$. We conclude that $g(a) = u(x_0, y_0)$ for all $a > 0$ such that $u(x,y)$ is harmonic in the disk of radius $a$, which establishes (4.131) for all such harmonic functions. Q.E.D.
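The averaging formula (4.131) is easy to test numerically. The sketch below is an illustration added here, not taken from the text: it approximates the circle average with an equally spaced sum, and the sample functions $e^x \cos y$ (harmonic) and $x^2 + y^2$ (not harmonic) are choices made for this example only.

```python
import math

def circle_average(u, x0, y0, a, m=2000):
    """Equally spaced approximation of the average of u over the circle
    of radius a centered at (x0, y0), i.e. the right-hand side of (4.131)."""
    dtheta = 2 * math.pi / m
    return sum(u(x0 + a * math.cos(j * dtheta), y0 + a * math.sin(j * dtheta))
               for j in range(m)) / m

def harmonic(x, y):
    """e^x cos y is the real part of e^z, hence harmonic."""
    return math.exp(x) * math.cos(y)

def not_harmonic(x, y):
    """x^2 + y^2 has Laplacian 4, so the mean value property fails."""
    return x * x + y * y

print(circle_average(harmonic, 0.3, -0.2, 1.5), harmonic(0.3, -0.2))          # equal up to rounding
print(circle_average(not_harmonic, 0.3, -0.2, 1.5), not_harmonic(0.3, -0.2))  # about 2.38 versus 0.13
```

For the non-harmonic example the average exceeds the central value by exactly $a^2$, which reflects the nonzero Laplacian.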

An  important  consequence  of  the  integral  formula  (4.131)  is  the  Strong  Maximum 
Principle  for  harmonic  functions. 

Theorem 4.9. Let u be a nonconstant harmonic function defined on a bounded domain $\Omega$ and continuous on $\partial\Omega$. Then u achieves its maximum and minimum values only at boundary points of the domain. In other words, if
$$m = \min \{\, u(x,y) \mid (x,y) \in \partial\Omega \,\}, \qquad M = \max \{\, u(x,y) \mid (x,y) \in \partial\Omega \,\},$$
are, respectively, its minimum and maximum values on the boundary, then
$$m < u(x,y) < M \qquad \text{at all interior points} \quad (x,y) \in \Omega.$$

Proof: Let $M^\star \geq M$ be the maximum value of u on all of $\overline{\Omega} = \Omega \cup \partial\Omega$, and assume $u(x_0, y_0) = M^\star$ at some interior point $(x_0, y_0) \in \Omega$. Theorem 4.8 implies that $u(x_0, y_0)$ equals its average over any circle C centered at $(x_0, y_0)$ that bounds a closed disk contained in $\Omega$. Since u is continuous and $\leq M^\star$ on C, its average must be strictly less than $M^\star$, except in the trivial case in which it is constant and equal to $M^\star$ on all of C. Thus, our assumption implies that $u(x,y) = M^\star = u(x_0, y_0)$ for all $(x,y)$ belonging to any circle $C \subset \Omega$ centered at $(x_0, y_0)$. Since $\Omega$ is connected, this allows us to conclude that $u(x,y) = M^\star$ is constant throughout $\Omega$, in contradiction to our original assumption. (You are asked to supply the details in Exercise 4.3.42.)

A similar argument works for the minimum; alternatively, one can interchange maximum and minimum by replacing u by $-u$. Q.E.D.

Physically, if we interpret $u(x, y)$ as the vertical displacement of a membrane stretched
over  a  wire,  then  Theorem  4.9  says  that,  in  the  absence  of  external  forcing,  the  membrane 
cannot  have  any  internal  bumps  —  its  highest  and  lowest  points  are  necessarily  on  the 
boundary  of  the  domain.  This  reconfirms  our  physical  intuition:  the  restoring  force  exerted 
by  the  stretched  membrane  will  serve  to  flatten  any  bump,  and  hence  a  membrane  with  a 
local  maximum  or  minimum  cannot  be  in  equilibrium.  A  similar  interpretation  holds  for 
heat  conduction.  A  body  in  thermal  equilibrium  will  achieve  its  maximum  and  minimum 
temperature  only  at  boundary  points.  Indeed,  thermal  energy  would  flow  away  from  any 
internal  maximum,  or  towards  any  local  minimum,  and  so  if  the  body  contained  a  local 
maximum  or  minimum  in  its  interior,  it  could  not  remain  in  thermal  equilibrium. 

The Maximum Principle immediately implies the uniqueness of solutions to the Dirichlet boundary value problem for both the Laplace and Poisson equations:

Theorem 4.10. If $u$ and $\tilde u$ both satisfy the same Poisson equation $-\Delta u = f = -\Delta \tilde u$ within a bounded domain $\Omega$ and $u = \tilde u$ on $\partial\Omega$, then $u \equiv \tilde u$ throughout $\Omega$.

Proof: By linearity, the difference $v = u - \tilde u$ satisfies the homogeneous boundary value problem $\Delta v = 0$ in $\Omega$ and $v = 0$ on $\partial\Omega$. Our assumption implies that the maximum and minimum boundary values of v are both $0 = m = M$. Theorem 4.9 implies that $v(x,y) \equiv 0$ at all $(x,y) \in \Omega$, and hence $u \equiv \tilde u$ everywhere in $\Omega$. Q.E.D.

Finally,  let  us  discuss  the  analyticity  of  harmonic  functions.  In  view  of  (4.119),  the 
nth  order  term  in  the  polar  series  solution  (4.115),  namely, 


$$a_n r^n \cos n\theta + b_n r^n \sin n\theta = a_n \operatorname{Re} z^n + b_n \operatorname{Im} z^n = \operatorname{Re}\left[ (a_n - i\, b_n)\, z^n \right]$$

is,  in  fact,  a  homogeneous  polynomial  in  (x,  y)  of  degree  n.  This  means  that,  when  written 
in  rectangular  coordinates  x  and  y,  (4.115)  is,  in  fact,  a  power  series  for  the  harmonic 
function  u(x,y).  It  is  well  known,  [8,  23,  97],  that  any  convergent  power  series  converges 
to  an  analytic  function  —  in  this  case  u(x,  y).  Moreover,  the  power  series  must,  in  fact,  be 
the  Taylor  series  for  u(x,y)  based  at  the  origin,  and  so  its  coefficients  are  multiples  of  the 
derivatives  of  u  at  x  =  y  =  0.  Details  are  worked  out  in  Exercise  4.3.49. 

We can adapt this argument to prove analyticity of all solutions to the Laplace equation. Note especially the contrast with the wave equation, which has many non-analytic solutions.

Theorem 4.11. A harmonic function is analytic at every point in the interior of its domain of definition.

Proof: Let $u(x, y)$ be a solution to the Laplace equation on the open domain $\Omega \subset \mathbb{R}^2$. Let $\mathbf{x}_0 = (x_0, y_0) \in \Omega$, and choose $a > 0$ such that the closed disk of radius $a$ centered at $\mathbf{x}_0$ is entirely contained within $\Omega$:
$$D_a(\mathbf{x}_0) = \{\, \| \mathbf{x} - \mathbf{x}_0 \| \leq a \,\} \subset \Omega,$$
where $\| \cdot \|$ is the usual Euclidean norm. Then the function $U(x,y)$ defined by (4.132) is harmonic on the unit disk, with well-defined boundary values. Thus, by the preceding remarks, $U(x,y)$ is analytic at every point inside the unit disk, and hence so is
$$u(x, y) = U\left( \frac{x - x_0}{a},\ \frac{y - y_0}{a} \right)$$
at every point $(x,y)$ in the interior of the disk $D_a(\mathbf{x}_0)$. Since $\mathbf{x}_0 \in \Omega$ was arbitrary, this establishes the analyticity of u throughout the domain. Q.E.D.

This concludes our discussion of the method of separation of variables for the planar
Laplace  equation  and  some  of  its  important  consequences.  The  method  can  be  used  in  a 
few  other  special  coordinate  systems.  See  [78,  79]  for  a  complete  account,  including  the 
fascinating  connections  with  the  underlying  symmetry  properties  of  the  equation. 


Exercises 


4.3.22. Solve the following Euler differential equations by use of the power ansatz:
(a) $x^2 u'' + 5x u' - 5u = 0$, (b) $2x^2 u'' - x u' - 2u = 0$, (c) $x^2 u'' - u = 0$,
(d) $x^2 u'' + x u' - 3u = 0$, (e) $3x^2 u'' - 5x u' - 3u = 0$, (f) $\dfrac{d^2 u}{dx^2} + \dfrac{2}{x} \dfrac{du}{dx} = 0$.

♦ 4.3.23. (i) Show that if $u(x)$ solves the Euler differential equation
$$a x^2 \frac{d^2 u}{dx^2} + b x \frac{du}{dx} + c u = 0, \tag{4.133}$$
then $v(y) = u(e^y)$ solves a linear constant-coefficient differential equation.
(ii) Use this technique to solve the Euler differential equations in Exercise 4.3.22.


4.3.24. (a) Use the method in Exercise 4.3.23 to solve an Euler equation whose characteristic equation has a double root $r_1 = r_2 = r$. (b) Solve the specific equations
(i) $x^2 u'' - x u' + u = 0$, (ii) $\dfrac{d^2 u}{dx^2} + \dfrac{1}{x} \dfrac{du}{dx} = 0$.


4.3.25. Solve the following boundary value problems:
(a) $\Delta u = 0$, $x^2 + y^2 < 1$, $u = x^3$, $x^2 + y^2 = 1$;
(b) $\Delta u = 0$, $x^2 + y^2 < 2$, $u = \log(x^2 + y^2)$, $x^2 + y^2 = 2$;
(c) $\Delta u = 0$, $x^2 + y^2 < 4$, $u = x^4$, $x^2 + y^2 = 4$;
(d) $\Delta u = 0$, $x^2 + y^2 < 1$, $\dfrac{\partial u}{\partial n} = x$, $x^2 + y^2 = 1$.


4.3.26. Let $u(x,y)$ be the solution to the boundary value problem $u_{xx} + u_{yy} = 0$, $x^2 + y^2 < 1$, $u(x,y) = x$, $x^2 + y^2 = 1$. Find $u(0,0)$.


♥ 4.3.27. (a) Find the equilibrium temperature on a disk of radius 1 when half the boundary is held at 1° and the other half is held at −1°. (b) Find the equilibrium temperature on a half-disk of radius 1 when the temperature is held to 1° on the curved edge and 0° on the straight edge. (c) Find the equilibrium temperature on a half-disk of radius 1 when the temperature is held to 0° on the curved edge and 1° on the straight edge.


4.3.28. Find the solution to Laplace's equation $u_{xx} + u_{yy} = 0$ on the semi-disk $x^2 + y^2 < 1$, $y > 0$, that satisfies the boundary conditions $u(x, 0) = 0$ for $-1 < x < 1$ and $u(x, y) = y^3$ for $x^2 + y^2 = 1$, $y > 0$.


4.3.29.  Find  the  equilibrium  temperature  on  a  half-disk  of  radius  1  when  the  temperature  is 
held  to  1°  on  the  curved  edge,  while  the  straight  edge  is  insulated. 


4.3.30.  Solve  the  Dirichlet  boundary  value  problem  for  the  Laplace  equation  on  the  pie  wedge 
VF  =  {0  <  0  <  \  7T,  0<r<l},  when  the  nonzero  boundary  data  u(  1,  0)  =  h(0)  appears 
only  on  the  curved  portion  of  its  boundary. 

4.3.31.  Find  a  harmonic  function  u(x,y)  defined  on  the  annulus  ^  <  r  <  1  subject  to  the 
constant  Dirichlet  boundary  conditions  u  =  a  on  r  =  ^  and  u  =  b  on  r  =  1. 

4.3.32. Boiling water flows continually through a long circular metal pipe of inner radius 1 cm and outer radius 1.2 cm placed in an ice water bath. True or false: The temperature at the midpoint, at radius 1.1 cm, is 50°. If false, what is the temperature at this point?

4.3.33. Write out the series solution to the boundary value problem $u(1,\theta) = 0$, $u(2,\theta) = h(\theta)$, for the Laplace equation on an annulus $1 < r < 2$. Hint: Use all of the separable solutions listed in (4.114).

4.3.34. Solve the following boundary value problems for the Laplace equation on the annulus $1 < r < 2$:
(a) $u(1,\theta) = 0$, $u(2,\theta) = 1$, (b) $u(1,\theta) = 0$, $u(2,\theta) = \cos\theta$,
(c) $u(1,\theta) = \sin 2\theta$, $u(2,\theta) = \cos 2\theta$, (d) $u_r(1,\theta) = 0$, $u(2,\theta) = 1$,
(e) $u_r(1,\theta) = 0$, $u(2,\theta) = \sin 2\theta$, (f) $u_r(1,\theta) = 0$, $u_r(2,\theta) = 1$,
(g) $u_r(1,\theta) = 2$, $u_r(2,\theta) = 1$.

4.3.35. Solve the following boundary value problems for the Laplace equation on the semi-annular domain $D = \{\, 1 < x^2 + y^2 < 2,\ y > 0 \,\}$:
(a) $u(x, y) = 0$, $x^2 + y^2 = 1$, $u(x, y) = 1$, $x^2 + y^2 = 2$, $u(x, 0) = 0$;
(b) $u(x, y) = 0$, $x^2 + y^2 = 1$ or $2$, $u(x, 0) = 0$, $x > 0$, $u(x, 0) = 1$, $x < 0$.

4.3.36.  Solve  the  following  boundary  value  problem: 

$(x^2 + y^2)(u_{xx} + u_{yy}) + 2x u_x + 2y u_y = 0$, $x^2 + y^2 < 1$, $u(x, y) = 1 + 3x$, $x^2 + y^2 = 1$.

♦ 4.3.37. Justify the chain rule computation (4.104). Then justify formula (4.105) for the Laplacian in polar coordinates.

4.3.38. Suppose $\displaystyle \int_{-\pi}^{\pi} |\, h(\theta) \,| \, d\theta < \infty$. Prove that (4.115) converges uniformly to the solution to the boundary value problem (4.101) on any smaller disk $D_{r_\star} = \{\, r \leq r_\star < 1 \,\} \subset D_1$.

4.3.39.  Prove  directly  that  (4.124)  satisfies  the  boundary  conditions  (4.122). 

4.3.40.  Justify  the  integration  formula  in  (4.128). 

4.3.41.  Provide  a  complete  proof  that  (4.129)  is  indeed  the  solution  to  the  boundary  value 
problem  (4.127). 

4.3.42. Complete the proof of Theorem 4.9 by showing that $u(x, y) = M^\star$ for all $(x, y) \in \Omega$. Hint: Join $(x_0, y_0)$ to $(x,y)$ by a curve $C \subset \Omega$ of finite length, and use the preceding part of the proof to inductively deduce the existence of a finite sequence of points $(x_i, y_i) \in C$, $i = 0, \ldots, n$, with $(x_n, y_n) = (x, y)$, and such that $u(x_i, y_i) = M^\star$.

♦ 4.3.43. Derive the analogue of the Poisson integral formula for the solution to the Neumann boundary value problem $\Delta u = 0$, $x^2 + y^2 < 1$, $\partial u/\partial n = h$, $x^2 + y^2 = 1$, on the unit disk. Pay careful attention to the existence and uniqueness of solutions in your formulation.

4.3.44.  Give  an  example  of  a  solution  to  Poisson’s  equation  on  the  unit  disk  that  achieves  its 
maximum  at  an  interior  point.  Interpret  your  construction  physically. 

4.3.45. Let $p(x,y)$ be a polynomial (not necessarily harmonic). Suppose $u(x,y)$ is harmonic and equals $p(x, y)$ on the unit circle $x^2 + y^2 = 1$. Prove that $u(x, y)$ is a harmonic polynomial.

4.3.46. Write down an integral formula for the solution to the Dirichlet boundary value problem on a disk of radius $R > 0$, namely, $\Delta u = 0$, $x^2 + y^2 < R^2$, $u = h$, $x^2 + y^2 = R^2$.

4.3.47.  State  and  prove  a  one-dimensional  version  of  Theorem  4.8.  Does  the  analogue  of 
Theorem  4.9  hold? 

4.3.48.  A  unit  area  square  plate  has  100°  temperature  on  its  top  edge  and  0°  on  its  three  other 
edges.  True  or  false:  The  temperature  at  the  center  equals  the  average  edge  temperature. 

♠ 4.3.49. Let $u(x, y)$ be a harmonic function on the unit disk with boundary values $h(\theta)$ when $r = 1$. Using the fact that (4.115) is the Taylor series for $u(x,y)$ at the origin: (a) Find integral formulas for its partial derivatives $u_x(0, 0)$, $u_y(0,0)$, involving the boundary values $h(\theta)$. (b) Generalize part (a) to the second-order derivatives $u_{xx}(0,0)$, $u_{xy}(0,0)$, $u_{yy}(0,0)$.

4.3.50. Prove that if $u(x, y)$ is a bounded harmonic function defined on all of $\mathbb{R}^2$, then u is constant. Hint: First generalize Exercise 4.3.49(a) to find the value of its gradient, $\nabla u(x_0, y_0)$, in terms of the values of u on a circle of radius $a$ centered at $(x_0, y_0)$. Then see what happens when the radius of the circle goes to $\infty$.


4.4  Classification  of  Linear  Partial  Differential  Equations 


We  have,  at  last,  been  introduced  to  the  three  paradigmatic  linear  second-order  partial 
differential  equations  for  functions  of  two  variables.  The  homogeneous  versions  are 

(a) The wave equation: $u_{tt} - c^2 u_{xx} = 0$, hyperbolic,

(b) The heat equation: $u_t - \gamma\, u_{xx} = 0$, parabolic,

(c) Laplace's equation: $u_{xx} + u_{yy} = 0$, elliptic.

The last column indicates the equation's type, in accordance with the standard taxonomy of partial differential equations; an explanation will appear momentarily. The wave, heat, and Laplace equations are the prototypical representatives of these three fundamental genres. Each genre has its own distinctive analytic features, physical manifestations, and even numerical solution schemes. Equations governing vibrations, such as the wave equation, are typically hyperbolic. Equations modeling diffusion, such as the heat equation, are parabolic. Hyperbolic and parabolic equations both typically represent dynamical processes, and so one of the independent variables is identified as time. On the other hand, equations modeling equilibrium phenomena, including the Laplace and Poisson equations, are usually elliptic, and involve only spatial variables. Elliptic partial differential equations are associated with boundary value problems, whereas parabolic and hyperbolic equations require initial and initial-boundary value problems.

The classification theory of real linear second-order partial differential equations for a scalar-valued function $u(t, x)$ depending on two variables proceeds as follows. (For equilibrium equations, we identify t with the space variable y.) The most general such equation has the form
$$A u_{tt} + B u_{tx} + C u_{xx} + D u_t + E u_x + F u = G, \tag{4.134}$$
where the coefficients A, B, C, D, E, F are all allowed to be functions of $(t,x)$, as is the inhomogeneity or forcing function $G(t,x)$. The equation is homogeneous if and only if $G \equiv 0$. We assume that at least one of the leading coefficients A, B, C is not identically zero, since otherwise the equation degenerates to a first-order equation.

The key quantity that determines the type of such a partial differential equation is its discriminant
$$\Delta = B^2 - 4AC. \tag{4.135}$$
This should (and for good reason) remind the reader of the discriminant of the quadratic equation
$$Q(x,y) = A x^2 + B x y + C y^2 + D x + E y + F = 0, \tag{4.136}$$
whose solutions trace out a plane curve, namely a conic section. In the nondegenerate cases, the discriminant (4.135) fixes its geometric type:

• a hyperbola when $\Delta > 0$,

• a parabola when $\Delta = 0$,

• an ellipse when $\Delta < 0$.

This  motivates  the  choice  of  terminology  used  to  classify  second-order  partial  differential 
equations. 


Definition 4.12. At a point $(t,x)$, the linear second-order partial differential equation (4.134) is called

• hyperbolic if $\Delta(t,x) > 0$,

• parabolic if $\Delta(t, x) = 0$, but $A^2 + B^2 + C^2 \neq 0$,

• elliptic if $\Delta(t, x) < 0$,

• singular if $A = B = C = 0$.


In particular:

• The wave equation $u_{tt} - u_{xx} = 0$ has discriminant $\Delta = 4$, and is hyperbolic.

• The heat equation $u_{xx} - u_t = 0$ has discriminant $\Delta = 0$, and is parabolic.

• The Poisson equation $u_{tt} + u_{xx} = -f$ has discriminant $\Delta = -4$, and is elliptic.


Example  4.13.  When  the  coefficients  A,  B ,  C  vary,  the  type  of  the  partial  differential 
equation  may  not  remain  fixed  over  the  entire  domain.  Equations  that  change  type  are 
less  common,  as  well  as  being  much  harder  to  analyze  and  solve,  both  analytically  and 
numerically.  One  example  arising  in  the  theory  of  supersonic  aerodynamics,  [44],  is  the 
Tricomi  equation 


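$$x\, \frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = 0. \tag{4.137}$$

Comparing with (4.134), we find that
$$A = x, \qquad B = 0, \qquad C = -1, \qquad \text{while} \quad D = E = F = G = 0.$$
The discriminant in this particular case is
$$\Delta = B^2 - 4AC = 4x,$$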

and hence the equation is hyperbolic when $x > 0$, elliptic when $x < 0$, and parabolic on the transition line $x = 0$. In the physical model, the hyperbolic region corresponds to supersonic flow, while the subsonic regions are of elliptic type. The transitional parabolic boundary represents the shock line between the sub- and super-sonic regions: the familiar sonic boom as an airplane crosses the sound barrier.
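Definition 4.12 translates directly into a few lines of code. The Python sketch below is an illustration added here; the function names `classify` and `tricomi_type` are assumptions, and the coefficients are taken as numbers already evaluated at the point $(t,x)$ of interest.

```python
def classify(A, B, C):
    """Type of A u_tt + B u_tx + C u_xx + (lower order terms) = G at one point."""
    if A == 0 and B == 0 and C == 0:
        return "singular"
    disc = B * B - 4 * A * C  # the discriminant (4.135)
    if disc > 0:
        return "hyperbolic"
    if disc < 0:
        return "elliptic"
    return "parabolic"

def tricomi_type(x):
    """Type of the Tricomi equation x u_tt - u_xx = 0 at a point with abscissa x."""
    return classify(x, 0, -1)

print(classify(1, 0, -1))  # wave equation u_tt - u_xx = 0     -> hyperbolic
print(classify(0, 0, 1))   # heat equation u_xx - u_t = 0      -> parabolic
print(classify(1, 0, 1))   # Poisson equation u_tt + u_xx = -f -> elliptic
print([tricomi_type(x) for x in (2, 0, -1)])  # ['hyperbolic', 'parabolic', 'elliptic']
```

Only the leading coefficients A, B, C enter the classification; the lower-order terms and the forcing function play no role.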


While this tripartite classification into hyperbolic, parabolic, and elliptic equations initially appears in the bivariate context, the terminology, underlying properties, and associated physical models carry over to second-order partial differential equations in higher dimensions. Most of the partial differential equations arising in applications fall into one of these three categories, and it is fair to say that the field of partial differential equations splits into three distinct subfields. Or rather four subfields, the last containing all the equations, including higher-order equations, that do not fit into the preceding categorization. (One important example appears in Section 8.5.)

Remark: The classification into hyperbolic, parabolic, elliptic, and singular types carries over as stated to quasilinear second-order equations, whose coefficients $A, \ldots, G$ are allowed to depend on u and its first-order derivatives, $u_t, u_x$. Here the type of the equation can vary with both the point in the domain and the particular solution being considered. Even more generally, for a fully nonlinear second-order partial differential equation
$$H(t, x, u, u_t, u_x, u_{tt}, u_{tx}, u_{xx}) = 0, \tag{4.138}$$
one defines its discriminant to be
$$\Delta = \left( \frac{\partial H}{\partial u_{tx}} \right)^2 - 4\, \frac{\partial H}{\partial u_{tt}}\, \frac{\partial H}{\partial u_{xx}}. \tag{4.139}$$
Its sign determines the type of the equation as above, again depending on the point in the domain and the solution under consideration.


Exercises 

4.4.1.  Plot  the  following  conic  sections  and  classify  their  type: 

(a) $x^2 + 3y^2 = 1$, (b) $xy + x + y = 4$, (c) $x^2 - xy + y^2 = x - 2y$,

(d) $x^2 + 2xy + y^2 + y = 1$, (e) $x^2 - 2y^2 = 6x + 8y + 1$.

4.4.2.  Determine  the  type  of  the  following  partial  differential  equations: 

(a) $u_{tt} + 3u_{xx} = 0$, (b) $u_{tx} + u_t + u_x = u$, (c) $u_{tt} + u_t + u_x = 0$,

(d) $u_{tt} - u_{tx} + u_{xx} = u$, (e) $u_{tt} + 4u_{tx} + 4u_{xx} = u_t$, (f) $u_{tx} + u_{xx} = 0$.

4.4.3.  Consider  the  partial  differential  equation  xutt  +  (t  +  x)uxx  =  0.  At  what  points  of  the 
plane  is  the  equation  elliptic?  hyperbolic?  parabolic?  degenerate? 

4.4.4.  Answer  Exercise  4.4.3  for  the  equations 

(a) $x^2 u_{xx} + x u_x + u_{yy} = 0$, (b) $\partial_x (x\, u_x) = \partial_y (y\, u_y)$, (c) $u_t = \partial_x [\, (x + t)\, u_x \,]$,

(d) $\nabla \cdot (c(x,y) \nabla u) = u$, where $c(x,y)$ is a given function.

4.4.5. Steady flow of air past an airplane is modeled by the partial differential equation $(1 - m^2)\, u_{xx} + u_{yy} = 0$, in which x is the flight direction, y the transverse direction, and $m > 0$ is the Mach number, i.e., the ratio of the airplane's speed to the speed of sound. Show that the equation is elliptic for subsonic flight, but hyperbolic for supersonic flight.

4.4.6.  Show  that  the  second-order  partial  differential  equation 

d  (  ,  .da 

is  elliptic  if  and  only  if  p(x,  y)  and  q(x,  y)  are  nonzero  and  have  the  same  sign. 

^  4.4.7.  True  or  false:  The  type  of  a  linear  second-order  partial  differential  equation  is  not  af¬ 
fected  by  a  change  of  independent  variables:  r  =  <p(t,  x),  t;  =  x). 

4.4.8.  Let  v(t,  x)  =  a(t,  x)  a(t,  x)  +  b(t,  x),  where  a,  b  are  fixed  functions  with  a  /  0.  Suppose  a 
is  a  solution  to  a  second-order  linear  partial  differential  equation.  Prove  that  v  also  solves  a 
linear  partial  differential  equation  of  the  same  type. 

4.4.9.  True  or  false:  The  polar  coordinate  form  (4.105)  of  the  Laplace  equation  is  elliptic. 

4.4.10.  Rewrite  the  Laplace  equation  uxx  +  uyy  =  0  in  terms  of  parabolic  coordinates  £,  rp  as 
defined  by  the  equations  x  =  £2  —  t?2,  y  =  2£>r).  Is  the  resulting  equation  elliptic? 


)  +r(x,y)u  =  f(x,y) 
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<0  4.4.11.  Prove  that  the  complex  change  of  variables  x  =  x,  t  =  i  y,  maps  the  Laplace  equation 
uxx  +  uyy  —  0  to  the  wave  equation  utt  =  uxx .  Explain  why  the  type  of  a  partial  differential 
equation  is  not  necessarily  preserved  under  a  complex  change  of  variables. 

C  4.4.12.  Suppose,  against  all  advice,  we  pose  the  elliptic  Laplace  equation  as  an  initial  value 
problem,  namely  utt  =  -uxx  for  0  <  x  <  1,  t  >  0, 

u( 0,  x )  =  /(#),  ut( 0,  x)  =  0,  0  <  x  <  1,  u(t,  0)  =  0  =  u(t,  1),  t  >  0. 

/  \  tj  , t  r  ...  .  ,  ^  „  ,,  r  /,  \  sinn7rt  coshn7rx 

(a)  Prove  that  for  any  positive  integer  n  >  0,  the  function  un(t,x)  =  - — - 

satisfies  the  initial  value  problem.  Determine  the  initial  condition  un{ 0,  x)  =  fn(x). 

(b)  Prove  that,  as  n  oo,  the  initial  condition  fn(x)  — 0  becomes  vanishingly  small, 

whereas,  at  any  t  >  0,  the  solution  value  un(t,  oo. 

(c)  Explain  why  this  represents  an  ill-posed  problem. 

4.4.13. The minimal surface equation $(1 + u_y^2)\,u_{xx} - 2\,u_x u_y u_{xy} + (1 + u_x^2)\,u_{yy} = 0$ is (a) hyperbolic, (b) parabolic, (c) elliptic, (d) singular, (e) of variable type depending on the point in the domain, or (f) of variable type depending on the solution and the point in the domain.


Characteristics  and  the  Cauchy  Problem 

In  Chapter  2,  we  discovered  that  the  characteristic  curves  guide  the  behavior  of  solutions 
to  first-order  partial  differential  equations.  Characteristics  play  a  similarly  fundamental 
role  in  the  analysis  of  more  general  hyperbolic  partial  differential  equations  and  systems. 
In  particular,  they  provide  a  mechanism  for  distinguishing  among  the  various  classes  of 
second-order  partial  differential  equations. 

As above, we will focus our attention on partial differential equations involving two independent variables. The starting point is the general initial value problem, also known as the Cauchy problem, in honor of the prolific nineteenth-century French mathematician Augustin-Louis Cauchy, justly famous for his wide-ranging contributions throughout mathematics and its applications, including the Cauchy-Schwarz inequality, many of the fundamental concepts in complex analysis, as well as the foundations of elasticity and materials science. The general Cauchy problem specifies appropriate initial data along a smooth curve† $\Gamma \subset \mathbb{R}^2$ and seeks a solution to the partial differential equation that assumes the given initial data on $\Gamma$. In all our examples, the curve in question has been a straight line, e.g., the x-axis, but one could easily envisage more general situations. If the partial differential equation has order n, then the Cauchy data consists of the values of the dependent variable u along with all its partial derivatives up to order n - 1 on the curve $\Gamma$. For most curves, there is a unique solution u(t, x) to the partial differential equation that achieves the specified values along $\Gamma$. More rigorously, if we are in the analytic category, meaning that the partial differential equation, the curve, and the Cauchy data are all specified by analytic functions, then the fundamental Cauchy-Kovalevskaya Theorem guarantees the existence of an analytic solution u(t, x) to the Cauchy problem near any point on the initial curve. The statement and proof of this important theorem, due to Cauchy and, in general form, to the influential nineteenth-century Russian mathematician Sofia Kovalevskaya, rely on the construction of convergent power series for the desired solution and would take us too far afield. We refer the interested reader to [35, 44]. The exceptional curves, for which the Cauchy-Kovalevskaya Existence Theorem does not apply, are called the characteristics of the underlying partial differential equation.

† More generally, for partial differential equations in m > 2 independent variables, the curve is replaced by a hypersurface $S \subset \mathbb{R}^m$ of dimension m - 1.

More prosaically, a curve $\Gamma$ will be called non-characteristic for the given partial differential equation if one can determine the values of all the derivatives of u along $\Gamma$ from the specified Cauchy data. Indeed, the determination of the values of the higher-order derivatives along the curve is a necessary preliminary step towards establishing the Cauchy-Kovalevskaya existence result. As we will now show, this requirement serves to distinguish the characteristic and non-characteristic curves for the examples we have already encountered, and hence to lead to their characterization in much more general contexts.

To illustrate the preceding requirement, let us begin with a first-order linear partial differential equation of the form

\[ \frac{\partial u}{\partial t} + c(t, x)\, \frac{\partial u}{\partial x} = f(t, x). \tag{4.140} \]

Let $\Gamma \subset \mathbb{R}^2$ be a smooth curve parametrized* by $\mathbf{x}(s) = (t(s), x(s))^T$, where smoothness necessitates that its tangent vector not vanish: $\mathbf{x}'(s) = (dt/ds, dx/ds)^T \neq 0$. Since the equation is of order n = 1, the Cauchy data requires specifying the values of the dependent variable u only along $\Gamma$, in other words, the function

\[ h(s) = u(t(s), x(s)). \tag{4.141} \]

The curve will be non-characteristic if we can then determine the values of the derivatives of u along $\Gamma$, starting with

\[ \frac{\partial u}{\partial t}(t(s), x(s)), \qquad \frac{\partial u}{\partial x}(t(s), x(s)). \tag{4.142} \]

To this end, let us differentiate the Cauchy data (4.141): applying the chain rule, we obtain

\[ h'(s) = \frac{d}{ds}\, u(t(s), x(s)) = \frac{\partial u}{\partial t}(t(s), x(s))\, \frac{dt}{ds} + \frac{\partial u}{\partial x}(t(s), x(s))\, \frac{dx}{ds}. \tag{4.143} \]

On the other hand, we are assuming that u(t, x) solves the partial differential equation (4.140) at all points in its domain of definition. In particular, at points on the curve $\Gamma$, the partial differential equation requires

\[ \frac{\partial u}{\partial t}(t(s), x(s)) + c(t(s), x(s))\, \frac{\partial u}{\partial x}(t(s), x(s)) = f(t(s), x(s)). \tag{4.144} \]

We can regard (4.143-144) as a pair of inhomogeneous linear algebraic equations, which can be uniquely solved for the as yet unknown quantities (4.142), unless the determinant of their coefficient matrix vanishes:

\[ \det \begin{pmatrix} 1 & c(t(s), x(s)) \\ dt/ds & dx/ds \end{pmatrix} = \frac{dx}{ds} - c(t(s), x(s))\, \frac{dt}{ds} = 0. \tag{4.145} \]


This condition serves to define a characteristic curve for the first-order partial differential equation (4.140). In particular, if the curve is parametrized by s = t, i.e., can be identified with the graph of a function x = g(t), then the characteristic condition (4.145) reduces to

\[ \frac{dx}{dt} = c(t, x), \tag{4.146} \]

thus reproducing our original definition of characteristic curve, as in (2.18) and, more generally, Exercise 2.2.26. On the other hand, if the determinant (4.145) is nonzero, then one can solve (4.143-144) for the values of the first-order derivatives (4.142) along $\Gamma$. Further differentiation of these conditions proves that one can, in fact, determine the values of all the higher-order derivatives of the solution u along the curve, which is hence non-characteristic.

* The parameter s could be the arc length, but this is not required. See also Exercise 4.4.20.
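The linear-algebra step in this argument can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `first_derivatives_on_curve`, its argument names, and the tolerance are hypothetical choices, not notation from the text. Given the values of c and f at a point of the curve, the tangent components, and h'(s), it solves (4.143-144) by Cramer's rule and returns None when the determinant (4.145) vanishes, i.e., when the tangent direction is characteristic.

```python
import math


def first_derivatives_on_curve(c, f, dh, dt, dx, tol=1e-12):
    """Solve (4.143)-(4.144) for (u_t, u_x) at one point of the curve.

    c, f   : values of c(t, x) and f(t, x) at the point (assumed given)
    dh     : h'(s), derivative of the Cauchy data along the curve
    dt, dx : components of the tangent vector (dt/ds, dx/ds)
    Returns None when the tangent direction is characteristic.
    """
    det = dx - c * dt  # determinant (4.145)
    if abs(det) < tol:
        return None
    # Linear system:   1 * u_t +  c * u_x = f    (the equation, 4.144)
    #                 dt * u_t + dx * u_x = dh   (the chain rule, 4.143)
    u_t = (f * dx - c * dh) / det
    u_x = (dh - f * dt) / det
    return u_t, u_x


# Illustration: u_t + 2 u_x = 0 with data h(s) = sin(s) on the x-axis (t = 0, x = s).
s = 0.3
print(first_derivatives_on_curve(c=2.0, f=0.0, dh=math.cos(s), dt=0.0, dx=1.0))
# The line x = 2 t has tangent (1, 2), which is characteristic here:
print(first_derivatives_on_curve(c=2.0, f=0.0, dh=0.0, dt=1.0, dx=2.0))
```

In the illustration, the exact solution is u = sin(x - 2t), so the first call should return values close to (-2 cos s, cos s), while the second call returns None because the data curve is a characteristic line.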

Next, consider a nonsingular linear second-order partial differential equation of the form (4.134). Since the equation has order n = 2, the Cauchy data along a curve $\Gamma$ parametrized as above consists of the values of the function and its first derivatives:

\[ u(t(s), x(s)), \qquad \frac{\partial u}{\partial t}(t(s), x(s)), \qquad \frac{\partial u}{\partial x}(t(s), x(s)). \tag{4.147} \]

However, the latter cannot be specified independently. Indeed, given the value of the dependent variable, h(s) = u(t(s), x(s)), along $\Gamma$, its derivative

\[ h'(s) = \frac{d}{ds}\, u(t(s), x(s)) = \frac{\partial u}{\partial t}(t(s), x(s))\, \frac{dt}{ds} + \frac{\partial u}{\partial x}(t(s), x(s))\, \frac{dx}{ds} \tag{4.148} \]

prescribes a particular combination of the two first-order derivatives. Thus, once the value of one derivative of u on $\Gamma$ is known, the other is automatically fixed by the relation (4.148). For example, if $dx/ds \neq 0$, we can use (4.148) to determine $u_x(t(s), x(s))$, knowing $u(t(s), x(s))$ and $u_t(t(s), x(s))$. Similarly, if we differentiate the values of the first-order derivatives with respect to the curve parameter, we can determine two combinations of second-order derivatives along the curve $\Gamma$:

\[
\begin{aligned}
\frac{d}{ds}\, \frac{\partial u}{\partial t}(t(s), x(s)) &= \frac{\partial^2 u}{\partial t^2}(t(s), x(s))\, \frac{dt}{ds} + \frac{\partial^2 u}{\partial t\, \partial x}(t(s), x(s))\, \frac{dx}{ds}, \\
\frac{d}{ds}\, \frac{\partial u}{\partial x}(t(s), x(s)) &= \frac{\partial^2 u}{\partial t\, \partial x}(t(s), x(s))\, \frac{dt}{ds} + \frac{\partial^2 u}{\partial x^2}(t(s), x(s))\, \frac{dx}{ds}.
\end{aligned}
\tag{4.149}
\]


On the other hand, the partial differential equation (4.134) induces yet a third relation among the second-order partial derivatives $u_{tt}, u_{tx}, u_{xx}$. These three linear equations can be uniquely solved for the values of these derivatives on $\Gamma$ if and only if the determinant of their coefficient matrix is nonzero, that is, unless

\[ \det \begin{pmatrix} A(t, x) & B(t, x) & C(t, x) \\ dt/ds & dx/ds & 0 \\ 0 & dt/ds & dx/ds \end{pmatrix} = A(t, x) \left( \frac{dx}{ds} \right)^2 - B(t, x)\, \frac{dt}{ds}\, \frac{dx}{ds} + C(t, x) \left( \frac{dt}{ds} \right)^2 = 0. \tag{4.150} \]


We conclude that a smooth curve $\mathbf{x}(s) = (t(s), x(s))^T \subset \mathbb{R}^2$ is a characteristic curve for the nonsingular linear second-order partial differential equation (4.134) whenever its tangent vector $\mathbf{x}'(s) = (dt/ds, dx/ds)^T \neq 0$ satisfies the quadratic characteristic equation (4.150). Conversely, if the curve is non-characteristic, meaning that its tangent does not satisfy (4.150) anywhere, then one can, with some further work, determine all the higher-order derivatives of the solution u(t, x) along $\Gamma$, and then, at least in the analytic category, prove existence of a solution to the Cauchy problem, [35].
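As a quick numerical sanity check of the determinant expansion in (4.150), the following Python sketch compares a cofactor expansion of the 3 x 3 coefficient matrix with the quadratic form in the tangent components. The helper names `det3`, `characteristic_form`, and `is_characteristic_direction`, as well as the tolerance, are assumptions made for this illustration only.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def characteristic_form(A, B, C, dt, dx):
    """Quadratic form A (dx/ds)^2 - B (dt/ds)(dx/ds) + C (dt/ds)^2 from (4.150)."""
    return A * dx**2 - B * dt * dx + C * dt**2


def is_characteristic_direction(A, B, C, dt, dx, tol=1e-12):
    """True when the tangent (dt/ds, dx/ds) satisfies (4.150) up to tol."""
    return abs(characteristic_form(A, B, C, dt, dx)) < tol


# Arbitrary sample coefficients and tangent: both expressions should agree.
A, B, C, dt, dx = 2.0, 3.0, -5.0, 0.7, -1.3
matrix = [[A, B, C], [dt, dx, 0.0], [0.0, dt, dx]]
print(det3(matrix), characteristic_form(A, B, C, dt, dx))

# Wave equation u_tt = 4 u_xx, i.e. A = 1, B = 0, C = -4:
print(is_characteristic_direction(1.0, 0.0, -4.0, 1.0, 2.0))   # tangent of x = 2 t
print(is_characteristic_direction(1.0, 0.0, -4.0, 0.0, 1.0))   # tangent of t = const
```

The second pair of calls tests two tangent directions for a wave equation with wave speed 2: the direction (1, 2) satisfies (4.150), whereas the direction (0, 1) of a line t = constant does not.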

According to Exercise 4.4.20, the status of a curve as characteristic or not does not depend on the choice of parametrization. In particular, if the curve is given by the graph of the function x = x(t), which we parametrize by s = t, then the characteristic equation (4.150) takes the form of a quadratically nonlinear first-order ordinary differential equation

\[ A(t, x) \left( \frac{dx}{dt} \right)^2 - B(t, x)\, \frac{dx}{dt} + C(t, x) = 0, \tag{4.151} \]

whose solutions are characteristic curves of the second-order partial differential equation.

Warning: If A(t, x) = 0, then the partial differential equation admits characteristic curves with vertical tangents that cannot be parametrized by s = t. For example, if $A(t, x) \equiv 0$, then the vertical lines, e.g., t = constant, x = s, are characteristic, satisfying (4.150), but do not appear as solutions to (4.151).

For example, consider the hyperbolic wave equation

\[ u_{tt} = c^2\, u_{xx}. \]

According to (4.151), any characteristic curve that is given by the graph of x(t) must solve

\[ \left( \frac{dx}{dt} \right)^2 - c^2 = 0, \qquad \text{which implies that} \qquad \frac{dx}{dt} = \pm\, c. \]

Thus, in accordance with our previous analysis, the characteristic curves are the straight lines of slope $\pm\, c$, and there are two characteristic curves passing through each point of the (t, x)-plane. On the other hand, the elliptic Laplace equation

\[ u_{tt} + u_{xx} = 0 \]

has no (real) characteristic curves, since the characteristic equation (4.150) reduces to

\[ \left( \frac{dx}{ds} \right)^2 + \left( \frac{dt}{ds} \right)^2 = 0, \]

and dt/ds and dx/ds are not allowed to vanish simultaneously. Finally, for the parabolic heat equation

\[ u_{xx} = u_t, \]

the characteristic curve equation (4.150) is simply

\[ \left( \frac{dt}{ds} \right)^2 = 0 \]

(since the first-derivative term plays no role), and so there is only one characteristic curve passing through each point, namely the vertical line t = constant. Observe that the standard initial value problem u(0, x) = f(x) for the heat equation takes place on a characteristic curve (the x-axis) but does not take the form of a Cauchy problem, which would also require specifying the first-order derivatives $u_t(0, x)$, $u_x(0, x)$ there. And indeed, the standard initial value problem is not well-posed near the characteristic x-axis for negative t < 0.

In general, the number of real solutions to the nondegenerate quadratic characteristic curve equation (4.150) depends on its discriminant $\Delta = B^2 - 4AC$: In the hyperbolic case, $\Delta > 0$, and there are two real characteristic curves passing through each point; in the parabolic case, $\Delta = 0$, and there is just one real characteristic curve passing through each point; in the elliptic case, $\Delta < 0$, and there are no real characteristic curves. In this manner, elliptic, parabolic, and hyperbolic partial differential equations are distinguished by the number of (real) characteristic curves passing through a point, namely zero, one, and two, respectively. First-order partial differential equations are also viewed as hyperbolic, since they always admit real characteristic curves.
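The discriminant test lends itself to a short computational sketch. The Python functions below are illustrative only: the names `classify` and `characteristic_slopes` are hypothetical, the coefficients A, B, C are assumed to be given as plain numbers at a single point, and exact comparison with zero is used for simplicity (a tolerance would be more appropriate for floating-point coefficients). The second function returns the real slopes dx/dt solving (4.151); as the earlier warning explains, vertical characteristics are not captured when A = 0.

```python
import math


def classify(A, B, C):
    """Type of A u_tt + B u_tx + C u_xx + ... at a point, via Delta = B^2 - 4 A C."""
    if A == 0 and B == 0 and C == 0:
        return "singular"
    delta = B * B - 4 * A * C
    if delta > 0:
        return "hyperbolic"
    if delta == 0:
        return "parabolic"
    return "elliptic"


def characteristic_slopes(A, B, C):
    """Real slopes dx/dt solving A p^2 - B p + C = 0, cf. (4.151).

    Vertical characteristics (possible when A = 0) are not represented.
    """
    if A == 0:
        # (4.151) degenerates to the linear equation -B p + C = 0.
        return [C / B] if B != 0 else []
    delta = B * B - 4 * A * C
    if delta < 0:
        return []
    root = math.sqrt(delta)
    # A set removes the duplicate root in the parabolic case.
    return sorted({(B - root) / (2 * A), (B + root) / (2 * A)})


for name, (A, B, C) in {
    "wave, c = 2": (1, 0, -4),
    "Laplace": (1, 0, 1),
    "heat": (0, 0, 1),
}.items():
    print(name, classify(A, B, C), characteristic_slopes(A, B, C))
```

For the heat equation the list of slopes is empty even though the equation is parabolic, because its single family of characteristics consists of the vertical lines t = constant.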

With further analysis, [35, 70, 122], it can be shown that, as with the wave equation, signals and disturbances propagate along characteristic curves. Thus, hyperbolic equations share many qualitative properties with the wave equation, with signals moving in two different directions. For example, light rays move along characteristic curves, and are thereby subject to the optical phenomena of refraction and focusing. Similarly, since the characteristic curves for the parabolic heat equation are the vertical lines, this indicates that the effect of a disturbance at a point $(t, x) = (t_0, x_0)$ is simultaneously felt along the entire contemporaneous vertical line $t = t_0$. This has the implication that disturbances in the heat equation propagate at infinite speed, a counterintuitive fact that will be further expounded on in Section 8.1. Elliptic equations have no characteristics, and as a consequence, do not support propagating signals; indeed, the effect of a localized disturbance is immediately felt throughout the domain. For example, even when an external force is concentrated near a single point, it displaces the entire membrane.


Exercises 

4.4.14.  Find  and  graph  the  real  characteristic  curves  for  each  of  the  partial  differential 
equations  in  Exercise  4.4.2. 

4.4.15.  Graph  the  characteristic  curves  for  the  Tricomi  equation  (4.137)  in  its  hyperbolic  region. 
What  happens  to  the  characteristics  as  one  approaches  the  parabolic  transition  boundary? 

4.4.16. True or false: The characteristic curves of the Helmholtz equation $u_{xx} + u_{yy} - u = 0$ are circles.

4.4.17. (a) At what points of the plane is the partial differential equation $x\,u_{xx} + y\,u_{yy} = 0$ elliptic? parabolic? hyperbolic? (b) How many characteristics are there through the point (1, -1)? (c) Find them explicitly.

o 

4.4.18.  Consider  the  partial  differential  equation  uxx  +  yuxy  —  y  . 

(a)  On  which  regions  of  the  (x,  y)-plane  is  the  equation  elliptic?  parabolic?  hyperbolic? 

( b )  Find  the  characteristics  in  the  hyperbolic  region. 

(c)  Find  the  general  solution  in  the  hyperbolic  region.  Hint:  Use  characteristic  coordinates. 

4.4.19. Find a partial differential equation whose characteristic curves are:

(a) the lines x - y = a, x + 2y = b, where $a, b \in \mathbb{R}$ are arbitrary constants;

(b) the exponential curves $y = c\,e^x$ for $c \in \mathbb{R}$;

(c) the concentric circles $x^2 + y^2 = a$ for a > 0, and the rays y = bx.

4.4.20.  Prove  that  any  reparametrization  of  a  characteristic  curve  for  a  given  second-order 
linear  partial  differential  equation  is  also  a  characteristic  curve. 

4.4.21.  True  or  false:  You  can  uniquely  recover  a  second-order  partial  differential  equation  by 
knowing  all  its  characteristic  curves. 

0 4.4.22. Prove that any invertible change of variables, as in Exercise 4.4.7, maps the characteristic curves of the original linear partial differential equation to the characteristic curves of the transformed equation. Thus, characteristic curves are intrinsic: they do not depend on the parametrization, nor on the coordinates used to represent the partial differential equation.


Chapter  5 

Finite  Differences 


As  one  quickly  learns,  the  differential  equations  that  can  be  solved  by  explicit  analytic 
formulas  are  few  and  far  between.  Consequently,  the  development  of  accurate  numerical 
approximation  schemes  is  an  essential  tool  for  extracting  quantitative  information  as  well 
as  achieving  a  qualitative  understanding  of  the  possible  behaviors  of  solutions  to  the  vast 
majority  of  partial  differential  equations.  (On  the  other  hand,  the  successful  design  of 
numerical algorithms necessitates a fairly deep understanding of their basic analytic properties, and so exclusive reliance on numerics is not an option.) Even in cases, such as
the  heat  and  wave  equations,  in  which  explicit  solution  formulas  (either  in  closed  form  or 
infinite  series)  exist,  numerical  methods  can  still  be  profitably  employed.  Indeed,  one  can 
accurately  test  a  proposed  numerical  algorithm  by  running  it  on  a  known  solution.  As  we 
will  see,  the  lessons  learned  in  the  design  and  testing  of  numerical  algorithms  on  simpler 
“solved”  examples  are  of  inestimable  value  when  confronting  more  challenging  problems. 

Many  of  the  basic  numerical  solution  schemes  for  partial  differential  equations  can  be 
fit  into  two  broad  themes.  The  first,  to  be  presented  in  the  present  chapter,  is  that  of 
finite difference methods, obtained by replacing the derivatives in the equation by appropriate numerical differentiation formulae. We thus start with a brief discussion of some
elementary  finite  difference  formulas  used  to  numerically  approximate  first-  and  second- 
order  derivatives  of  functions.  We  then  establish  and  analyze  some  of  the  most  basic  finite 
difference  schemes  for  the  heat  equation,  first-order  transport  equations,  the  second-order 
wave equation, and the Laplace and Poisson equations. As we will learn, not all finite difference schemes produce accurate numerical approximations, and one must confront issues
of  stability  and  convergence  in  order  to  distinguish  reliable  from  worthless  methods.  In 
fact,  inspired  by  Fourier  analysis,  the  key  numerical  stability  criterion  is  a  consequence  of 
the  scheme’s  handling  of  complex  exponentials. 

The second category of numerical solution techniques comprises the finite element methods, which will be the topic of Chapter 10. These two chapters should be regarded as
but  a  preliminary  excursion  into  this  vast  and  active  area  of  contemporary  research.  More 
sophisticated  variations  and  extensions,  as  well  as  other  classes  of  numerical  integration 
schemes,  e.g.,  spectral,  pseudo-spectral,  multigrid,  multipole,  probabilistic  (Monte  Carlo, 
etc.),  geometric,  symplectic,  and  many  more,  can  be  found  in  specialized  numerical  analysis 
texts,  including  [6,51,60,80,94],  and  research  papers.  Also,  the  journal  Acta  Numerica 
is  an  excellent  source  of  survey  papers  on  state-of-the-art  numerical  methods  for  a  broad 
range  of  disciplines. 


5.1  Finite  Difference  Approximations 

In general, a finite difference approximation to the value of some derivative of a scalar function u(x) at a point $x_0$ in its domain, say $u'(x_0)$ or $u''(x_0)$, relies on a suitable combination of sampled function values at nearby points. The underlying formalism used to construct these approximation formulas is known as the calculus of finite differences. Its development has a long and influential history, dating back to Newton.

We begin with the first-order derivative. The simplest finite difference approximation is the ordinary difference quotient

\[ \frac{u(x + h) - u(x)}{h} \approx u'(x), \tag{5.1} \]


which appears in the original calculus definition of the derivative. Indeed, if u is differentiable at x, then u'(x) is, by definition, the limit, as $h \to 0$, of the finite difference quotients. Geometrically, the difference quotient measures the slope of the secant line through the two points (x, u(x)) and (x + h, u(x + h)) on its graph. For small enough h, this should be a reasonably good approximation to the slope of the tangent line, u'(x), as illustrated in the first picture in Figure 5.1. Throughout our discussion, h, the step size, which may be either positive or negative, is assumed to be small: $|h| \ll 1$. When h > 0, (5.1) is referred to as a forward difference, while h < 0 yields a backward difference.

How close an approximation is the difference quotient? To answer this question, we assume that u(x) is at least twice continuously differentiable, and examine its first-order Taylor expansion

\[ u(x + h) = u(x) + u'(x)\, h + \tfrac{1}{2}\, u''(\xi)\, h^2 \tag{5.2} \]

at the point x. We have used Lagrange's formula for the remainder term, [8, 97], in which $\xi$, which depends on both x and h, is a point lying between x and x + h. Rearranging (5.2), we obtain

\[ \frac{u(x + h) - u(x)}{h} - u'(x) = \tfrac{1}{2}\, u''(\xi)\, h. \]


Thus, the error in the finite difference approximation (5.1) can be bounded by a multiple of the step size:

\[ \left| \frac{u(x + h) - u(x)}{h} - u'(x) \right| \leq C\, |h|, \]

where $C = \max \tfrac{1}{2}\, |u''(\xi)|$ depends on the magnitude of the second derivative of the function over the interval in question. Since the error is proportional to the first power of h, we say that the finite difference quotient (5.1) is a first-order approximation to the derivative u'(x). When the precise formula for the error is not so important, we will write

\[ u'(x) = \frac{u(x + h) - u(x)}{h} + O(h). \tag{5.3} \]

The "big Oh" notation O(h) refers to a term that is proportional to h, or, more precisely, whose absolute value is bounded by a constant multiple of |h| as $h \to 0$.
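The error bound can be checked numerically on a simple case. The sketch below, in Python, assumes the test function u(x) = sin x at x = 1, for which |u''| = |sin| is at most 1, so that C = 1/2 is an admissible constant; the function names are hypothetical and chosen only for this illustration.

```python
import math


def forward_difference(u, x, h):
    """Difference quotient (5.1); a backward difference results when h < 0."""
    return (u(x + h) - u(x)) / h


def error_within_bound(h, x=1.0, C=0.5):
    """Check |quotient - u'(x)| <= C |h| for u = sin, where |u''| <= 1."""
    error = forward_difference(math.sin, x, h) - math.cos(x)
    return abs(error) <= C * abs(h)


for h in (0.1, 0.01, 0.001, -0.01):
    print(h, error_within_bound(h))
```

Negative step sizes are included to emphasize that the bound involves |h| and therefore covers backward differences as well.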


[Figure 5.1. Finite difference approximations. Panels: forward difference; central difference.]

Example 5.1. Let u(x) = sin x. Let us try to approximate

\[ u'(1) = \cos 1 = .5403023\ldots \]

by computing finite difference quotients

\[ \cos 1 \approx \frac{\sin(1 + h) - \sin 1}{h}. \]

The result for smaller and smaller (positive) values of h is listed in the following table.

    h                .1          .01         .001        .0001
    approximation    .497364     .536086     .539881     .540260
    error           -.042939    -.004216    -.000421    -.000042

We observe that reducing the step size by a factor of 1/10 reduces the size of the error by approximately the same factor. Thus, to obtain 10 decimal digits of accuracy, we anticipate needing a step size of about $h = 10^{-11}$. The fact that the error is more or less proportional to the step size confirms that we are dealing with a first-order numerical approximation.
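The table in Example 5.1 can be reproduced with a short Python sketch. The function names `forward_difference` and `forward_errors` are hypothetical, and standard double-precision floats are assumed, so digits beyond those printed in the table may differ from the source computation.

```python
import math


def forward_difference(u, x, h):
    """Forward difference quotient (5.1)."""
    return (u(x + h) - u(x)) / h


def forward_errors(steps=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Pairs (h, error) for approximating u'(1) = cos 1 with u = sin."""
    exact = math.cos(1.0)
    return [(h, forward_difference(math.sin, 1.0, h) - exact) for h in steps]


results = forward_errors()
for h, err in results:
    print(f"h = {h:<7g} approximation = {math.cos(1.0) + err:.6f}  error = {err:.6f}")

# Ratios of successive errors should be close to 10 for a first-order method.
for (_, e1), (_, e2) in zip(results, results[1:]):
    print(f"error ratio = {e1 / e2:.3f}")
```

The ratios of successive errors give a quick empirical check of the order: values near 10 for step sizes decreasing by a factor of 10 are consistent with an error proportional to h.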


To approximate higher-order derivatives, we need to evaluate the function at more than two points. In general, an approximation to the nth order derivative $u^{(n)}(x)$ requires at least n + 1 distinct sample points. For simplicity, we restrict our attention to equally spaced sample points, although the methods introduced can be readily extended to more general configurations.

For example, let us try to approximate u''(x) by sampling u at the particular points x, x + h, and x - h. Which combination of the function values u(x - h), u(x), u(x + h) should be used? The answer is found by consideration of the relevant Taylor expansions†

\[
\begin{aligned}
u(x + h) &= u(x) + u'(x)\, h + u''(x)\, \frac{h^2}{2} + u'''(x)\, \frac{h^3}{6} + O(h^4), \\
u(x - h) &= u(x) - u'(x)\, h + u''(x)\, \frac{h^2}{2} - u'''(x)\, \frac{h^3}{6} + O(h^4),
\end{aligned}
\tag{5.4}
\]

where the error terms are proportional to $h^4$. Adding the two formulas together yields

\[ u(x + h) + u(x - h) = 2\, u(x) + u''(x)\, h^2 + O(h^4). \]

† Throughout, the function u(x) is assumed to be sufficiently smooth so that any derivatives that appear are well defined and the expansion formula is valid.
Dividing by $h^2$ and rearranging terms, we arrive at the centered finite difference approximation to the second derivative of a function:

\[ u''(x) = \frac{u(x + h) - 2\, u(x) + u(x - h)}{h^2} + O(h^2). \tag{5.5} \]

Since the error is proportional to $h^2$, this forms a second-order approximation.

Example 5.2. Let $u(x) = e^{x^2}$, with $u''(x) = (4x^2 + 2)\, e^{x^2}$. Let us approximate

\[ u''(1) = 6e = 16.30969097\ldots \]

using the finite difference quotient (5.5):

\[ u''(1) = 6e \approx \frac{e^{(1+h)^2} - 2e + e^{(1-h)^2}}{h^2}. \]

The results are listed in the following table.

    h                .1             .01            .001           .0001
    approximation    16.48289823    16.31141265    16.30970819    16.30969115
    error            .17320726      .00172168      .00001722      .00000018

Each reduction in step size by a factor of 1/10 reduces the size of the error by a factor of about 1/100, thereby gaining two new decimal digits of accuracy, which confirms that the centered finite difference approximation is of second order.

However, this prediction is not completely borne out in practice. If we take h = .00001 then the formula produces the approximation 16.3097002570, with an error of .0000092863, which is less accurate than the approximation with h = .0001. The problem is that round-off errors due to the finite precision of numbers stored in the computer (in the preceding computation we used single-precision floating-point arithmetic) have now begun to affect the computation. This highlights the inherent difficulty with numerical differentiation: Finite difference formulae inevitably require dividing very small quantities, and so round-off inaccuracies may produce noticeable numerical errors. Thus, while they typically produce reasonably good approximations to the derivatives for moderately small step sizes, achieving high accuracy requires switching to higher-precision computer arithmetic. Indeed, a similar comment applies to the previous computation in Example 5.1. Our expectations about the error were not, in fact, fully justified, as you may have discovered had you tried an extremely small step size.
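The interplay between truncation error and round-off error is easy to observe experimentally. The following Python sketch applies the centered second difference quotient to $u(x) = e^{x^2}$ at x = 1 for a range of step sizes. It assumes standard double-precision floats, so the step size at which round-off takes over, and the digits obtained there, need not match the values quoted in the text; the function names are hypothetical.

```python
import math


def second_difference(u, x, h):
    """Centered approximation (5.5) to u''(x)."""
    return (u(x + h) - 2.0 * u(x) + u(x - h)) / h**2


def u(x):
    """Test function u(x) = exp(x^2), with u''(1) = 6 e."""
    return math.exp(x * x)


exact = 6.0 * math.e

for h in (1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-8):
    approx = second_difference(u, 1.0, h)
    print(f"h = {h:<6g} approximation = {approx:.8f}  error = {approx - exact:.8f}")
```

For the larger step sizes the error shrinks by roughly a factor of 100 per row, while for the smallest ones the subtraction of nearly equal function values, followed by division by the tiny quantity h squared, makes the result unreliable.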


Another way to improve the order of accuracy of finite difference approximations is to employ more sample points. For instance, if the first-order approximation (5.3) to u'(x) based on the two points x and x + h is not sufficiently accurate, one can try combining the function values at three points, say x, x + h, and x - h. To find the appropriate combination of function values u(x - h), u(x), u(x + h), we return to the Taylor expansions (5.4). To solve for u'(x), we subtract the two formulas, and so

\[ u(x + h) - u(x - h) = 2\, u'(x)\, h + O(h^3). \]

Rearranging the terms, we are led to the well-known centered difference formula

\[ u'(x) = \frac{u(x + h) - u(x - h)}{2h} + O(h^2), \tag{5.6} \]
which is a second-order approximation to the first derivative. Geometrically, the centered difference quotient represents the slope of the secant line passing through the two points (x - h, u(x - h)) and (x + h, u(x + h)) on the graph of u, which are centered symmetrically about the point x. Figure 5.1 illustrates the two approximations, and the advantage of the centered difference version is graphically evident. Higher-order approximations can be found by evaluating the function at yet more sample points, say, x + 2h, x - 2h, etc.

Example 5.3. Return to the function u(x) = sin x considered in Example 5.1. The centered difference approximation to its derivative u'(1) = cos 1 = .5403023... is

\[ \cos 1 \approx \frac{\sin(1 + h) - \sin(1 - h)}{2h}. \]

The results are tabulated as follows:

    h                .1               .01              .001             .0001
    approximation    .53940225217     .54029330087     .54030221582     .54030230497
    error           -.00090005370    -.00000900499    -.00000009005    -.00000000090

As advertised, the results are much more accurate than the one-sided finite difference approximation used in Example 5.1 at the same step size. Since it is a second-order approximation, each reduction in the step size by a factor of 1/10 results in two more decimal places of accuracy, up until the point where the effects of round-off error kick in.
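One way to confirm the order of accuracy empirically is to compare the errors at step sizes h and h/10 and take the base-10 logarithm of their ratio. The Python sketch below does this for the centered difference quotient applied to sin x at x = 1; the function names `centered_difference` and `observed_order` are assumptions made for this illustration, and double-precision arithmetic is assumed.

```python
import math


def centered_difference(u, x, h):
    """Centered difference quotient (5.6) for u'(x)."""
    return (u(x + h) - u(x - h)) / (2.0 * h)


def observed_order(u, du_exact, x, h):
    """Estimate p in error ~ h^p by comparing step sizes h and h / 10."""
    e1 = abs(centered_difference(u, x, h) - du_exact)
    e2 = abs(centered_difference(u, x, h / 10.0) - du_exact)
    return math.log10(e1 / e2)


exact = math.cos(1.0)
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    approx = centered_difference(math.sin, 1.0, h)
    print(f"h = {h:<7g} approximation = {approx:.11f}  error = {approx - exact:.11f}")

print("observed order:", observed_order(math.sin, exact, 1.0, 0.1))
```

An observed order close to 2 is consistent with the second-order accuracy of (5.6); the estimate degrades once h is small enough for round-off error to dominate.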


Many additional finite difference approximations can be constructed by similar manipulations of Taylor expansions, but these few very basic formulas, along with a couple that are derived in the exercises, will suffice for our purposes. (For a thorough treatment of the calculus of finite differences, the reader can consult [74].) In the following sections, we will employ the finite difference formulas to devise numerical solution schemes for a variety of partial differential equations. Applications to the numerical integration of ordinary differential equations can be found, for example, in [24, 60, 63].


Exercises 


£ 5.1.1. Use the finite difference formula (5.3) with step sizes h = .1, .01, and .001 to approximate the derivative u'(1) of the following functions u(x). Discuss the accuracy of your approximation. (a) $x^4$, (b) $\dfrac{1}{1 + x^2}$, (c) log x, (d) cos x, (e) $\tan^{-1} x$.


X  5.1.2.  Repeat  Exercise  5.1.1  using  the  centered  difference  formula  (5.6).  Compare  your 

approximations  with  those  in  the  previous  exercise  —  are  the  values  in  accordance  with  the 
claimed  orders  of  accuracy? 

£ 5.1.3. Approximate the second derivative u''(1) of the functions in Exercise 5.1.1 using the finite difference formula (5.5) with h = .1, .01, and .001. Discuss the accuracy of your approximations.


5.1.4. Construct finite difference approximations to the first and second derivatives of a function u(x) using its values at the points x - k, x, x + h, where $h, k \ll 1$ are of comparable size, but not necessarily equal. What can you say about the error in the approximation?
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5.1.5.  In  this  exercise,  you  are  asked  to  derive  some  basic  one-sided  finite  difference  formulas , 
which  are  used  for  approximating  derivatives  of  functions  at  or  near  the  boundary  of  their 
domain,  (a)  Construct  a  finite  difference  formula  that  approximates  the  derivative  v!  (x) 
using  the  values  of  u(x)  at  the  points  x,  x  +  h,  and  x  +  2h.  What  is  the  order  of  your 
formula?  (b)  Find  a  finite  difference  formula  for  u"  (x)  that  involves  the  same  three  func¬ 
tion  values.  What  is  its  order?  (c)  Test  your  formulas  by  computing  approximations  to  the 

2 

first  and  second  derivatives  of  u(x)  =  ex  at  x  =  1  using  step  sizes  h  =  .1,  .01,  and  .001. 
What  is  the  error  in  your  numerical  approximations?  Are  the  errors  compatible  with  the 
theoretical  orders  of  the  finite  difference  formulas?  Discuss  why  or  why  not. 

(d)  Answer  part  (c)  at  the  point  x  =  0. 

5.1.6. (a) Using the function values $u(x)$, $u(x + h)$, $u(x + 3h)$, construct a numerical approximation
to the derivative $u'(x)$. (b) What is the order of accuracy of your approximation?
(c) Test your approximation on the function $u(x) = \cos x$ at $x = 1$ using the step sizes
$h = .1$, $.01$, and $.001$. Are the errors consistent with your answer in part (b)?

5.1.7. Answer Exercise 5.1.6 for the second derivative $u''(x)$.

5.1.8. (a) Find the order of the five-point centered finite difference approximation

$$ u'(x) \approx \frac{-\,u(x + 2h) + 8\,u(x + h) - 8\,u(x - h) + u(x - 2h)}{12\,h}. $$

(b) Test your result on the function $(1 + x^2)^{-1}$ at $x = 1$ using the values $h = .1, .01, .001$.

5.1.9. (a) Using the formula in Exercise 5.1.8 as a guide, find five-point finite difference formulas
to approximate (i) $u''(x)$, (ii) $u'''(x)$, (iii) $u^{(iv)}(x)$. What is the order of accuracy?
(b) Test your formulas on the function $(1 + x^2)^{-1}$ at $x = 1$ using the values $h = .1, .01, .001$.


Consider the heat equation

$$ \frac{\partial u}{\partial t} = \gamma\,\frac{\partial^2 u}{\partial x^2}, \qquad 0 < x < \ell, \quad t > 0, \tag{5.7} $$

on an interval of length $\ell$, with constant thermal diffusivity $\gamma > 0$. We impose time-dependent
Dirichlet boundary conditions

$$ u(t, 0) = \alpha(t), \qquad u(t, \ell) = \beta(t), \qquad t > 0, \tag{5.8} $$

fixing the temperature at the ends of the interval, along with the initial conditions

$$ u(0, x) = f(x), \qquad 0 \le x \le \ell, \tag{5.9} $$


specifying the initial temperature distribution. In order to effect a numerical approximation
to the solution to this initial-boundary value problem, we begin by introducing a rectangular
mesh consisting of nodes $(t_j, x_m) \in \mathbb{R}^2$ with

$$ 0 = t_0 < t_1 < t_2 < \cdots \qquad \text{and} \qquad 0 = x_0 < x_1 < \cdots < x_n = \ell. $$

For simplicity, we maintain a uniform mesh spacing in both directions, with

$$ \Delta t = t_{j+1} - t_j, \qquad \Delta x = x_{m+1} - x_m = \frac{\ell}{n}, $$

representing, respectively, the time step size and the spatial mesh size. It will be essential
that we do not a priori require that the two be the same. We shall use the notation

$$ u_{j,m} \approx u(t_j, x_m), \qquad \text{where} \quad t_j = j\,\Delta t, \quad x_m = m\,\Delta x, \tag{5.10} $$

to denote the numerical approximation to the solution value at the indicated node.

As a first attempt at designing a numerical solution scheme, we shall employ the
simplest finite difference approximations to the derivatives appearing in the equation. The
second-order space derivative is approximated by the centered difference formula (5.5), and
hence

$$ \begin{aligned} \frac{\partial^2 u}{\partial x^2}(t_j, x_m) &\approx \frac{u(t_j, x_{m+1}) - 2\,u(t_j, x_m) + u(t_j, x_{m-1})}{(\Delta x)^2} + O\bigl((\Delta x)^2\bigr) \\ &\approx \frac{u_{j,m+1} - 2\,u_{j,m} + u_{j,m-1}}{(\Delta x)^2} + O\bigl((\Delta x)^2\bigr), \end{aligned} \tag{5.11} $$

where the error in the approximation is proportional to $(\Delta x)^2$. Similarly, the one-sided
finite difference approximation (5.3) is used to approximate the time derivative, and so

$$ \frac{\partial u}{\partial t}(t_j, x_m) \approx \frac{u(t_{j+1}, x_m) - u(t_j, x_m)}{\Delta t} + O(\Delta t) \approx \frac{u_{j+1,m} - u_{j,m}}{\Delta t} + O(\Delta t), \tag{5.12} $$

where the error is proportional to $\Delta t$. In general, one should try to ensure that the
approximations have similar orders of accuracy, which leads us to require

$$ \Delta t \approx (\Delta x)^2. \tag{5.13} $$

Assuming $\Delta x < 1$, this implies that the time steps must be much smaller than the space
mesh size.


Remark: At this stage, the reader might be tempted to replace (5.12) by the second-order
central difference approximation (5.6). However, this introduces significant complications,
and the resulting numerical scheme is not practical; see Exercise 5.2.10.

Replacing the derivatives in the heat equation (5.7) by their finite difference approximations
(5.11, 12) and rearranging terms, we end up with the linear system

$$ u_{j+1,m} = \mu\,u_{j,m+1} + (1 - 2\mu)\,u_{j,m} + \mu\,u_{j,m-1}, \qquad j = 0, 1, 2, \ldots, \quad m = 1, \ldots, n-1, \tag{5.14} $$

in which

$$ \mu = \frac{\gamma\,\Delta t}{(\Delta x)^2}. \tag{5.15} $$

The resulting scheme is of iterative form, whereby the solution values $u_{j+1,m} \approx u(t_{j+1}, x_m)$
at time $t_{j+1}$ are successively calculated, via (5.14), from those at the preceding time $t_j$.

The initial condition (5.9) indicates that we should initialize our numerical data by
sampling the initial temperature at the nodes:

$$ u_{0,m} = f_m = f(x_m), \qquad m = 1, \ldots, n-1. \tag{5.16} $$

Similarly, the boundary conditions (5.8) require that

$$ u_{j,0} = \alpha_j = \alpha(t_j), \qquad u_{j,n} = \beta_j = \beta(t_j), \qquad j = 0, 1, 2, \ldots. \tag{5.17} $$

For consistency, we should assume that the initial and boundary conditions agree at the
corners of the domain:

$$ f_0 = f(0) = u(0, 0) = \alpha(0) = \alpha_0, \qquad f_n = f(\ell) = u(0, \ell) = \beta(0) = \beta_0. $$

The three equations (5.14, 16, 17) completely prescribe the numerical approximation scheme
for the solution to the initial-boundary value problem (5.7-9).
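To make the prescription concrete, here is a minimal Python sketch of the explicit scheme, written directly from the update rule $u_{j+1,m} = \mu\,u_{j,m+1} + (1 - 2\mu)\,u_{j,m} + \mu\,u_{j,m-1}$ together with sampled initial data and Dirichlet boundary values. The function name `explicit_heat`, its argument list, and the sine test data are illustrative assumptions, not notation from the text.

```python
import math

def explicit_heat(f, alpha, beta, gamma, ell, n, dt, steps):
    """March the explicit scheme for u_t = gamma * u_xx on [0, ell].

    f is the initial temperature, alpha(t) and beta(t) are the boundary
    temperatures at x = 0 and x = ell; n is the number of mesh intervals.
    Returns the mesh points and the numerical solution after `steps` steps.
    """
    dx = ell / n
    mu = gamma * dt / dx**2
    x = [m * dx for m in range(n + 1)]
    u = [f(xm) for xm in x]              # sample the initial data at the nodes
    for j in range(steps):
        t_next = (j + 1) * dt
        new = u[:]
        for m in range(1, n):            # interior nodes only
            new[m] = mu * u[m + 1] + (1 - 2 * mu) * u[m] + mu * u[m - 1]
        new[0] = alpha(t_next)           # boundary values at the new time
        new[n] = beta(t_next)
        u = new
    return x, u

# Illustrative run: f(x) = sin(pi x), zero boundary data, mu = 0.4, final time 0.1.
x, u = explicit_heat(lambda s: math.sin(math.pi * s), lambda t: 0.0, lambda t: 0.0,
                     gamma=1.0, ell=1.0, n=10, dt=0.004, steps=25)
print(u[5], math.exp(-math.pi**2 * 0.1))  # numerical and exact values at x = 0.5
```

For this test problem the exact solution is $e^{-\pi^2 t}\sin \pi x$, so the two printed numbers can be compared directly; on such a coarse mesh they should agree only to about two decimal places.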

Let us rewrite the preceding equations in a more transparent vectorial form. First, let

$$ \mathbf{u}^{(j)} = \bigl( u_{j,1}, u_{j,2}, \ldots, u_{j,n-1} \bigr)^T \approx \bigl( u(t_j, x_1), u(t_j, x_2), \ldots, u(t_j, x_{n-1}) \bigr)^T \tag{5.18} $$

be the vector whose entries are the numerical approximations to the solution values at time
$t_j$ at the interior nodes. We omit the boundary nodes $(t_j, x_0)$, $(t_j, x_n)$, since those values
are fixed by the boundary conditions (5.17). Then (5.14) takes the form

$$ \mathbf{u}^{(j+1)} = A\,\mathbf{u}^{(j)} + \mathbf{b}^{(j)}, \tag{5.19} $$

where

$$ A = \begin{pmatrix} 1 - 2\mu & \mu & & & \\ \mu & 1 - 2\mu & \mu & & \\ & \mu & 1 - 2\mu & \mu & \\ & & \ddots & \ddots & \ddots \\ & & & \mu & 1 - 2\mu \end{pmatrix}, \qquad \mathbf{b}^{(j)} = \begin{pmatrix} \mu\,\alpha_j \\ 0 \\ 0 \\ \vdots \\ 0 \\ \mu\,\beta_j \end{pmatrix}. \tag{5.20} $$

The $(n-1) \times (n-1)$ coefficient matrix $A$ is symmetric and tridiagonal, and only its nonzero
entries are displayed. The contributions (5.17) of the boundary nodes appear in the vector
$\mathbf{b}^{(j)} \in \mathbb{R}^{n-1}$. This numerical method is known as an explicit scheme, since each iterate is
computed directly from its predecessor without having to solve any auxiliary equations,
unlike the implicit schemes to be discussed next.


Example 5.4. Let us fix the diffusivity $\gamma = 1$ and the interval length $\ell = 1$. For
illustrative purposes, we take a spatial step size of $\Delta x = .1$. We work with the initial data

$$ u(0, x) = f(x) = \begin{cases} -x, & 0 \le x \le \tfrac{1}{5}, \\ x - \tfrac{2}{5}, & \tfrac{1}{5} \le x \le \tfrac{7}{10}, \\ 1 - x, & \tfrac{7}{10} \le x \le 1, \end{cases} $$

used earlier in Example 4.1. In Figure 5.2 we compare the numerical solutions resulting
from two (slightly) different time step sizes. The first row uses $\Delta t = (\Delta x)^2 = .01$ and plots
the solution at the indicated times. The numerical solution is already showing signs of
instability (the final plot does not even fit in the window), and indeed, soon thereafter, it
becomes completely wild. The second row takes $\Delta t = .005$. Even though we are employing
a rather coarse mesh, the numerical solution is not too far away from the true solution to
the initial value problem, which can be seen in Figure 4.1.
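The contrast between the two time steps can be reproduced with a few lines of code. The sketch below is a rough experiment under stated assumptions: it uses the piecewise linear data $f(x) = -x$, $x - \tfrac25$, $1 - x$ on $[0, \tfrac15]$, $[\tfrac15, \tfrac7{10}]$, $[\tfrac7{10}, 1]$, assumes homogeneous Dirichlet boundary values (consistent with $f(0) = f(1) = 0$), and reports only the largest absolute value of the numerical solution at the final time. The names `f` and `run_explicit` and the final time $t = .5$ are illustrative choices.

```python
def f(x):
    """Piecewise linear initial data of the example."""
    if x <= 0.2:
        return -x
    if x <= 0.7:
        return x - 0.4
    return 1.0 - x

def run_explicit(dt, t_final, n=10, gamma=1.0, ell=1.0):
    """Run the explicit scheme and return max |u| over the mesh at t_final."""
    dx = ell / n
    mu = gamma * dt / dx**2
    u = [f(m * dx) for m in range(n + 1)]
    for _ in range(round(t_final / dt)):
        interior = [mu * u[m + 1] + (1 - 2 * mu) * u[m] + mu * u[m - 1]
                    for m in range(1, n)]
        u = [0.0] + interior + [0.0]     # assumed zero boundary temperatures
    return max(abs(v) for v in u)

print("dt = .01 :", run_explicit(0.01, 0.5))    # mu = 1, grows without bound
print("dt = .005:", run_explicit(0.005, 0.5))   # mu = 1/2, stays bounded
```

With $\Delta t = .005$ each new value is an average of its two neighbors, so the numerical solution can never exceed the size of the initial data, whereas with $\Delta t = .01$ the values grow by many orders of magnitude.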


Stability  Analysis 

In light of the preceding calculation, we need to understand why our numerical scheme
sometimes gives reasonable answers but sometimes utterly fails. To this end, we investigate
the effect of the numerical scheme on simple functions. As we know, the general solution
to the heat equation can be decomposed into a sum over the various Fourier modes. Thus,
we can concentrate on understanding what the numerical scheme does to an individual
complex exponential,† bearing in mind that we can then reconstruct its effect on more
general initial data by taking suitable linear combinations of exponentials.

Figure 5.2. Numerical solutions for the heat equation based on the explicit scheme.

To this end, suppose that, at time $t = t_j$, the solution is a sampled exponential

$$ u(t_j, x) = e^{\,i k x}, \qquad \text{and so} \qquad u_{j,m} = u(t_j, x_m) = e^{\,i k x_m}, \tag{5.21} $$

where $k$ is a real parameter. Substituting the latter values into our numerical equations
(5.14), we find that the updated value at time $t_{j+1}$ is also a sampled exponential:

$$ \begin{aligned} u_{j+1,m} &= \mu\,u_{j,m+1} + (1 - 2\mu)\,u_{j,m} + \mu\,u_{j,m-1} \\ &= \mu\,e^{\,i k x_{m+1}} + (1 - 2\mu)\,e^{\,i k x_m} + \mu\,e^{\,i k x_{m-1}} \\ &= \bigl( \mu\,e^{\,i k \Delta x} + (1 - 2\mu) + \mu\,e^{-i k \Delta x} \bigr)\,e^{\,i k x_m} = \lambda\,e^{\,i k x_m}, \end{aligned} \tag{5.22} $$

where

$$ \begin{aligned} \lambda = \lambda(k) &= \mu\,e^{\,i k \Delta x} + (1 - 2\mu) + \mu\,e^{-i k \Delta x} \\ &= 1 - 2\mu\,\bigl[\, 1 - \cos(k\,\Delta x) \,\bigr] = 1 - 4\mu \sin^2\bigl( \tfrac12 k\,\Delta x \bigr). \end{aligned} \tag{5.23} $$

Thus, the effect of a single step is to multiply the complex exponential (5.21) by the
magnification factor $\lambda$:

$$ u(t_{j+1}, x) = \lambda\,e^{\,i k x}. \tag{5.24} $$

In other words, $e^{\,i k x}$ plays the role of an eigenfunction, with the magnification factor $\lambda(k)$
the corresponding eigenvalue, of the linear operator governing each step of the numerical
scheme. Continuing in this fashion, we find that the effect of $p$ further iterations of the
scheme is to multiply the exponential by the $p$th power of the magnification factor:

$$ u(t_{j+p}, x) = \lambda^p\,e^{\,i k x}. \tag{5.25} $$

As a result, the stability is governed by the size of the magnification factor: If $|\lambda| > 1$,
then $\lambda^p$ grows exponentially, and so the numerical solutions (5.25) become unbounded as
$p \to \infty$, which is clearly incompatible with the analytical behavior of solutions to the
heat equation. Therefore, an evident necessary condition for the stability of our numerical
scheme is that its magnification factor satisfy

$$ |\lambda| \le 1. \tag{5.26} $$

† As usual, complex exponentials are easier to work with than real trigonometric functions.

This method of stability analysis was developed by the mid-twentieth-century
Hungarian/American mathematician, and father of the electronic computer, John von
Neumann. The stability criterion (5.26) effectively distinguishes the stable, and hence
valid, numerical algorithms from the unstable, and hence ineffectual, schemes. For the
particular case (5.23), the von Neumann stability criterion (5.26) requires

$$ -1 \le 1 - 4\mu \sin^2\bigl( \tfrac12 k\,\Delta x \bigr) \le 1, \qquad \text{or, equivalently,} \qquad 0 \le \mu \sin^2\bigl( \tfrac12 k\,\Delta x \bigr) \le \tfrac12. $$

Since this is required to hold for all possible $k$, we must have

$$ 0 \le \mu = \frac{\gamma\,\Delta t}{(\Delta x)^2} \le \frac12, \qquad \text{and hence} \qquad \Delta t \le \frac{(\Delta x)^2}{2\,\gamma}, \tag{5.27} $$

since $\gamma > 0$. Thus, once the space mesh size is fixed, stability of the numerical scheme
places a restriction on the allowable time step size. For instance, if $\gamma = 1$, and the space
mesh size $\Delta x = .01$, then we must adopt a minuscule time step size $\Delta t \le .00005$. It
would take an exorbitant number of time steps to compute the value of the solution at
even moderate times, e.g., $t = 1$. Moreover, the accumulation of round-off errors might
then cause a significant reduction in the overall accuracy of the final solution values. Since
not all choices of space and time steps lead to a convergent scheme, the explicit scheme
(5.14) is called conditionally stable.
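The stability bound can be checked numerically by sampling the magnification factor $\lambda(k) = 1 - 4\mu \sin^2(\tfrac12 k\,\Delta x)$ over a full period of $k\,\Delta x$. The following Python sketch does just that; the helper names and the number of sample points are illustrative assumptions.

```python
import math

def explicit_magnification(mu, k, dx):
    """Magnification factor of the explicit scheme for the mode exp(i k x)."""
    return 1.0 - 4.0 * mu * math.sin(0.5 * k * dx) ** 2

def worst_case(mu, dx, samples=1000):
    """Largest |lambda(k)| over sampled wave numbers with 0 <= k*dx <= 2*pi."""
    ks = [2.0 * math.pi * i / (samples * dx) for i in range(samples + 1)]
    return max(abs(explicit_magnification(mu, k, dx)) for k in ks)

def max_stable_dt(gamma, dx):
    """Largest time step allowed by the bound dt <= dx^2 / (2 gamma)."""
    return dx**2 / (2.0 * gamma)

print(worst_case(0.5, 0.1))      # borderline stable: the maximum modulus is 1
print(worst_case(1.0, 0.1))      # unstable: the maximum modulus is 3
print(max_stable_dt(1.0, 0.01))  # the minuscule time step quoted in the text
```

The worst case is always attained by the most rapidly oscillating mode, $k\,\Delta x = \pi$, for which $\lambda = 1 - 4\mu$.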


Implicit  and  Crank-Nicolson  Methods 


An unconditionally stable method, one that does not restrict the time step, can be
constructed by replacing the forward difference formula (5.12) used to approximate the
time derivative by the backwards difference formula

$$ \frac{\partial u}{\partial t}(t_j, x_m) \approx \frac{u(t_j, x_m) - u(t_{j-1}, x_m)}{\Delta t} + O(\Delta t). \tag{5.28} $$

Substituting (5.28) and the same centered difference approximation (5.11) for $u_{xx}$ into the
heat equation, and then replacing $j$ by $j + 1$, leads to the iterative system

$$ -\mu\,u_{j+1,m+1} + (1 + 2\mu)\,u_{j+1,m} - \mu\,u_{j+1,m-1} = u_{j,m}, \qquad j = 0, 1, 2, \ldots, \quad m = 1, \ldots, n-1, \tag{5.29} $$

where the parameter $\mu = \gamma\,\Delta t/(\Delta x)^2$ is as before. The initial and boundary conditions
have the same form (5.16, 17). The latter system can be written in the matrix form

$$ \widehat{A}\,\mathbf{u}^{(j+1)} = \mathbf{u}^{(j)} + \mathbf{b}^{(j+1)}, \tag{5.30} $$

where $\widehat{A}$ is obtained from the matrix $A$ in (5.20) by replacing $\mu$ by $-\mu$. This serves to
define an implicit scheme, since we have to solve a linear system of algebraic equations
at each step in order to compute the next iterate $\mathbf{u}^{(j+1)}$. However, since the coefficient
matrix $\widehat{A}$ is tridiagonal, the solution can be computed extremely rapidly, [89], and so its
calculation is not an impediment to the practical implementation of this implicit scheme.

Figure 5.3. Numerical solutions for the heat equation based on the implicit scheme.
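One standard way to exploit the tridiagonal structure is forward elimination followed by back substitution (often called the Thomas algorithm), which costs a number of operations proportional to $n$. The Python sketch below is one possible realization and is not taken from the text; it solves the system with diagonal entries $1 + 2\mu$ and off-diagonal entries $-\mu$, adding the boundary contributions $\mu\,\alpha_{j+1}$ and $\mu\,\beta_{j+1}$ to the right-hand side. The function names and argument conventions are assumptions, and no pivoting is performed, which is safe here because the matrix is diagonally dominant.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system by elimination without pivoting.

    lower[i] multiplies x[i-1] and upper[i] multiplies x[i+1] in row i;
    lower[0] and upper[-1] are ignored.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):                       # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def implicit_heat_step(u_interior, mu, alpha_next, beta_next):
    """Advance the interior values one time step with the implicit scheme."""
    size = len(u_interior)
    rhs = list(u_interior)
    rhs[0] += mu * alpha_next                   # boundary value at x = 0, new time
    rhs[-1] += mu * beta_next                   # boundary value at x = ell, new time
    return solve_tridiagonal([-mu] * size, [1 + 2 * mu] * size, [-mu] * size, rhs)

# Illustrative step: sampled sine data, zero boundary values, mu = 1.
u0 = [math.sin(math.pi * m / 10) for m in range(1, 10)]
u1 = implicit_heat_step(u0, 1.0, 0.0, 0.0)
print(u1[4] / u0[4])   # ratio by which the sine mode is damped in one step
```

Since the sampled sine is an eigenvector of the coefficient matrix, every entry is scaled by the same factor, which offers a convenient correctness check.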

Example 5.5. Consider the same initial-boundary value problem considered in Example 5.4.
In Figure 5.3, we plot the numerical solutions obtained using the implicit
scheme. The initial data is not displayed, but we graph the numerical solutions at times
$t = .02, .04, .06$ with a mesh size of $\Delta x = .1$. In the top row, we use a time step of $\Delta t = .01$,
while in the bottom row $\Delta t = .005$. In contrast to the explicit scheme, there is very little
difference between the two. Indeed, both come much closer to the actual solution than
the explicit scheme. In fact, even significantly larger time steps yield reasonable numerical
approximations to the solution.

Let us apply the von Neumann analysis to investigate the stability of the implicit
scheme. Again, we need only look at the effect of the scheme on a complex exponential.
Substituting (5.21, 24) into (5.29) and canceling the common exponential factor leads to
the equation

$$ \lambda\,\bigl( -\mu\,e^{\,i k \Delta x} + 1 + 2\mu - \mu\,e^{-i k \Delta x} \bigr) = 1. $$

We solve for the magnification factor

$$ \lambda = \frac{1}{1 + 2\mu\,\bigl( 1 - \cos(k\,\Delta x) \bigr)} = \frac{1}{1 + 4\mu \sin^2\bigl( \tfrac12 k\,\Delta x \bigr)}. \tag{5.31} $$

Since $\mu > 0$, the magnification factor never exceeds 1 in absolute value, and so the
stability criterion (5.26) is satisfied for any choice of step sizes. We conclude that the
implicit scheme (5.29) is unconditionally stable.

Figure 5.4. Numerical solutions for the heat equation based on the Crank-Nicolson scheme.

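Another popular numerical scheme for solving the heat equation is the Crank-Nicolson
method, due to the British numerical analysts John Crank and Phyllis Nicolson:

$$ u_{j+1,m} - u_{j,m} = \tfrac12\,\mu\,\bigl( u_{j+1,m+1} - 2\,u_{j+1,m} + u_{j+1,m-1} + u_{j,m+1} - 2\,u_{j,m} + u_{j,m-1} \bigr), \tag{5.32} $$

which can be obtained by averaging the explicit and implicit schemes (5.14) and (5.29).
We can write (5.32) in vectorial form

$$ \widehat{B}\,\mathbf{u}^{(j+1)} = B\,\mathbf{u}^{(j)} + \tfrac12 \bigl( \mathbf{b}^{(j)} + \mathbf{b}^{(j+1)} \bigr), $$

where

$$ \widehat{B} = \begin{pmatrix} 1 + \mu & -\tfrac12\mu & & \\ -\tfrac12\mu & 1 + \mu & -\tfrac12\mu & \\ & -\tfrac12\mu & \ddots & \ddots \\ & & \ddots & \ddots \end{pmatrix}, \qquad B = \begin{pmatrix} 1 - \mu & \tfrac12\mu & & \\ \tfrac12\mu & 1 - \mu & \tfrac12\mu & \\ & \tfrac12\mu & \ddots & \ddots \\ & & \ddots & \ddots \end{pmatrix} \tag{5.33} $$

are both tridiagonal.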
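A single Crank-Nicolson step therefore amounts to one tridiagonal multiplication followed by one tridiagonal solve. The Python sketch below illustrates this under stated assumptions: the names `thomas` and `crank_nicolson_step` are hypothetical, the boundary values at the old and new time levels are passed in explicitly, and the linear system is solved by elimination without pivoting, which is adequate because the matrix with diagonal $1 + \mu$ and off-diagonals $-\tfrac12\mu$ is diagonally dominant.

```python
import math

def thomas(lower, diag, upper, rhs):
    """Tridiagonal solve; lower[i] multiplies x[i-1], upper[i] multiplies x[i+1]."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def crank_nicolson_step(u, mu, a_old, b_old, a_new, b_new):
    """Advance the interior values u by one Crank-Nicolson step.

    a_old, b_old are the boundary values at the current time level and
    a_new, b_new those at the next one (left and right ends respectively).
    """
    size = len(u)
    padded = [a_old] + list(u) + [b_old]
    # Explicit half: the matrix B applied to u, plus old boundary contributions.
    rhs = [0.5 * mu * padded[m - 1] + (1 - mu) * padded[m] + 0.5 * mu * padded[m + 1]
           for m in range(1, size + 1)]
    # New boundary contributions moved to the right-hand side.
    rhs[0] += 0.5 * mu * a_new
    rhs[-1] += 0.5 * mu * b_new
    return thomas([-0.5 * mu] * size, [1 + mu] * size, [-0.5 * mu] * size, rhs)

# Illustrative step: sampled sine data, zero boundary values, mu = 5.
u0 = [math.sin(math.pi * m / 10) for m in range(1, 10)]
u1 = crank_nicolson_step(u0, 5.0, 0.0, 0.0, 0.0, 0.0)
print(u1[4] / u0[4])   # damping ratio of the sine mode over one step
```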

Applying the von Neumann analysis as before, we deduce that the magnification factor
has the form

$$ \lambda = \frac{1 - 2\mu \sin^2\bigl( \tfrac12 k\,\Delta x \bigr)}{1 + 2\mu \sin^2\bigl( \tfrac12 k\,\Delta x \bigr)}. \tag{5.34} $$

Since $\mu > 0$, we see that $|\lambda| \le 1$ for all choices of step size, and so the Crank-Nicolson
scheme is also unconditionally stable. A detailed analysis based on a Taylor expansion of
the solution reveals that the errors are of order $(\Delta t)^2$ and $(\Delta x)^2$, and so it is reasonable to
choose the time step to have the same order of magnitude as the space step: $\Delta t \approx \Delta x$. This
gives the Crank-Nicolson scheme a significant advantage over the previous two methods,
in that one can get away with far fewer time steps. However, applying it to the initial
value problem considered above reveals a subtle weakness. The top row in Figure 5.4 has
space and time step sizes $\Delta t = \Delta x = .01$, and does a reasonable job of approximating
the solution except near the corners, where an annoying and incorrect local oscillation
persists as the solution decays. The bottom row uses $\Delta t = \Delta x = .001$, and performs much
better, although a similar oscillatory error can be observed at much smaller times. Indeed,
unlike the implicit scheme, the Crank-Nicolson method fails to rapidly damp out the
high-frequency Fourier modes associated with small-scale features such as discontinuities and
corners in the initial data, although it performs quite well in smooth regimes. Thus, when
dealing with irregular initial data, a good strategy is to first run the implicit scheme until
the small-scale noise is dissipated away, and then switch to Crank-Nicolson with a much
larger time step to determine the later large scale dynamics.
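The weak damping of high-frequency modes can be read off directly from the three magnification factors. The short Python sketch below evaluates them at the most oscillatory mode, $k\,\Delta x = \pi$, for $\gamma = 1$ and $\Delta t = \Delta x = .01$, so that $\mu = 100$; the function names and the variable `theta` standing for $k\,\Delta x$ are illustrative assumptions.

```python
import math

def lam_explicit(mu, theta):
    """Explicit scheme: 1 - 4 mu sin^2(theta / 2), with theta = k * dx."""
    return 1 - 4 * mu * math.sin(theta / 2) ** 2

def lam_implicit(mu, theta):
    """Implicit scheme: 1 / (1 + 4 mu sin^2(theta / 2))."""
    return 1 / (1 + 4 * mu * math.sin(theta / 2) ** 2)

def lam_crank_nicolson(mu, theta):
    """Crank-Nicolson: (1 - s) / (1 + s) with s = 2 mu sin^2(theta / 2)."""
    s = 2 * mu * math.sin(theta / 2) ** 2
    return (1 - s) / (1 + s)

mu = 1.0 * 0.01 / 0.01**2   # gamma * dt / dx^2 with dt = dx = .01
for name, lam in (("explicit", lam_explicit), ("implicit", lam_implicit),
                  ("Crank-Nicolson", lam_crank_nicolson)):
    print(name, lam(mu, math.pi))
```

The implicit factor is tiny, so the mode is wiped out in a step or two, whereas the Crank-Nicolson factor is close to $-1$: the mode flips sign at every step and decays only slowly, in keeping with the persistent oscillations described above.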

Finally,  we  remark  that  the  finite  difference  schemes  developed  above  for  the  heat 
equation  can  all  be  readily  adapted  to  more  general  parabolic  partial  differential  equations. 
The  stability  criteria  and  observed  behaviors  are  fairly  similar,  and  a  couple  of  illustrative 
examples  can  be  found  in  the  exercises. 


Exercises 


5.2.1. Suppose we seek to approximate the solution to the initial-boundary value problem

$$ u_t = 5\,u_{xx}, \qquad u(t, 0) = u(t, 3) = 0, \qquad u(0, x) = x(x - 1)(x - 3), \qquad 0 \le x \le 3, $$

by employing the explicit scheme (5.14). (a) Given the spatial mesh size $\Delta x = .1$, what
range of time steps $\Delta t$ can be used to produce an accurate numerical approximation?
(b) Test your prediction by implementing the scheme using one value of $\Delta t$ in the allowed
range and one value outside.


5.2.2. Solve the following initial-boundary value problem

$$ u_t = u_{xx}, \qquad u(t, 0) = u(t, 1) = 0, \qquad u(0, x) = f(x), \qquad 0 \le x \le 1, $$

with initial data

$$ f(x) = \begin{cases} 2\,\bigl| x - \tfrac16 \bigr| - \tfrac13, & 0 \le x \le \tfrac13, \\ 0, & \tfrac13 \le x \le \tfrac23, \\ \tfrac12 - 3\,\bigl| x - \tfrac56 \bigr|, & \tfrac23 \le x \le 1, \end{cases} $$

using (i) the explicit scheme (5.14); (ii) the implicit scheme (5.29); and (iii) the Crank-Nicolson
scheme (5.32). Use space step sizes $\Delta x = .1$ and $.05$, and suitably chosen time steps $\Delta t$.
Discuss which features of the solution can be observed in your numerical approximations.


5.2.3. Repeat Exercise 5.2.2 for the initial-boundary value problem $u_t = 3\,u_{xx}$, $u(0, x) = 0$,
$u(t, -1) = 1$, $u(t, 1) = -1$, using space step sizes $\Delta x = .2$ and $.1$.


5.2.4.  (a)  Solve  the  initial-boundary  value  problem 

ut  =  uxxi  u(£,  —  1)  =  u(t,  1)  =  0,  u( 0,  x) 


x 


-1  <  X  <  1, 


using (i) the explicit scheme (5.14); (ii) the implicit scheme (5.29); (iii) the Crank-Nicolson
scheme (5.32). Use $\Delta x = .1$ and an appropriate time step $\Delta t$. Compare your numerical
solutions at times $t = 0, .01, .02, .05, .1, .3, .5, 1.0$, and discuss your findings. (b) Repeat
part (a) for the implicit and Crank-Nicolson schemes with $\Delta x = .01$. Why aren't you being
asked to implement the explicit scheme?


5.2.5. Use the implicit scheme with spatial mesh sizes $\Delta x = .1$ and $.05$ and appropriately chosen
values of the time step $\Delta t$ to investigate the solution to the periodically forced boundary
value problem $u_t = u_{xx}$, $u(0, x) = 0$, $u(t, 0) = \sin 5\pi t$, $u(t, 1) = \cos 5\pi t$. Is your
solution periodic in time?


5.2.6. (a) How would you modify (i) the explicit scheme; (ii) the implicit scheme; to deal with
Neumann boundary conditions? Hint: Use the one-sided finite difference formulae found in
Exercise 5.1.5 to approximate the derivatives at the boundary.

(b)  Test  your  proposals  on  the  boundary  value  problem 


1  1 

Ut  =  Uxx,  u{ 0,  x)  =  2  +  COS  27TX  —  ^  COS  37TX,  ux  (t,  0)  =  0  =  Ux  (£,  1) , 

using space step sizes $\Delta x = .1$ and $.01$ and appropriate time steps. Compare your numerical
solution with the exact solution at times $t = .01, .03, .05$, and explain any discrepancies.


5.2.7. (a) Design an explicit numerical scheme for approximating the solution to the
initial-boundary value problem

$$ u_t = \gamma\,u_{xx} + s(x), \qquad u(t, 0) = u(t, 1) = 0, \qquad u(0, x) = f(x), \qquad 0 \le x \le 1, $$

for the heat equation with a source term $s(x)$. (b) Test your scheme when


7  =  jjr ,  s(x)  =  x(l  —  x)(10  —  22x), 


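$$ f(x) = \begin{cases} 2\,\bigl| x - \tfrac16 \bigr| - \tfrac13, & 0 \le x \le \tfrac13, \\ 0, & \tfrac13 \le x \le \tfrac23, \\ \tfrac12 - 3\,\bigl| x - \tfrac56 \bigr|, & \tfrac23 \le x \le 1, \end{cases} $$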


using  space  step  sizes  Ax  =  .1  and  .05,  and  a  suitably  chosen  time  step  At.  Are  your  two 
numerical  solutions  close?  (c)  What  is  the  long-term  behavior  of  the  solution?  Can  you 
find  a  formula  for  its  eventual  profile?  (d)  Design  an  implicit  scheme  for  the  same  problem. 
Does  this  affect  the  behavior  of  your  numerical  solution?  What  are  the  advantages  of  the 
implicit  scheme? 


5.2.8. Consider the initial-boundary value problem for the lossy diffusion equation

$$ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} - \alpha\,u, \qquad u(t, 0) = u(t, 1) = 0, \qquad u(0, x) = f(x), \qquad t > 0, \quad 0 < x < 1, $$

where $\alpha > 0$ is a positive constant. (a) Devise an explicit finite difference method for
computing a numerical approximation to the solution. (b) For what mesh sizes would
you expect your method to provide a good approximation to the solution?
(c) Discuss the case when $\alpha < 0$.

5.2.9. Consider the initial-boundary value problem for the diffusive transport equation

$$ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + 2\,\frac{\partial u}{\partial x}, \qquad u(t, 0) = u(t, 1) = 0, \qquad u(0, x) = x(1 - x), \qquad t > 0, \quad 0 < x < 1. $$

(a) Devise an explicit finite difference scheme for computing numerical approximations to
the solution. Hint: Make sure your approximations are of comparable order. (b) For what
range of time step sizes would you expect your method to provide a decent approximation
to the solution? (c) Test your answer in part (b) for the spatial step size $\Delta x = .1$.


5.2.10. (a) Show that using the centered difference approximation (5.6) to approximate the
time derivative leads to Richardson's method for numerically solving the heat equation:

$$ u_{j+1,m} = u_{j-1,m} + 2\mu\,u_{j,m+1} - 4\mu\,u_{j,m} + 2\mu\,u_{j,m-1}, \qquad j = 1, 2, \ldots, \quad m = 1, \ldots, n-1, $$

where $\mu = \gamma\,\Delta t/(\Delta x)^2$ is as in (5.15). (b) Discuss how to start Richardson's method.
(c) Discuss the stability of Richardson's method. (d) Test Richardson's method on the
initial-boundary value problem in Exercise 5.2.2. Does your numerical solution conform
with your expectations from part (b)?


5.3 Numerical Algorithms for First-Order Partial Differential Equations


Let us next apply the method of finite differences to construct some basic numerical methods
for first-order partial differential equations. As noted in Section 4.4, first-order partial
differential equations are prototypes for hyperbolic equations, and so many of the lessons
learned here carry over to the general hyperbolic regime, including the second-order wave
equation, which we analyze in detail in the following section.

Consider the initial value problem for the elementary transport equation

$$ \frac{\partial u}{\partial t} + c\,\frac{\partial u}{\partial x} = 0, \qquad u(0, x) = f(x), \qquad -\infty < x < \infty, \tag{5.35} $$

with constant wave speed $c$. Of course, as we learned in Section 2.2, the solution is a simple
traveling wave

$$ u(t, x) = f(x - c\,t) \tag{5.36} $$

that is constant along the characteristic lines of slope $c$ in the $(t, x)$-plane. Although the
analytical solution is completely elementary, there will be valuable lessons to be learned
from our attempt to reproduce it by numerical approximation. Indeed, each of the numerical
schemes developed below has an evident adaptation to transport equations with
variable wave speeds $c(t, x)$, and even to nonlinear transport equations whose wave speed
depends on the solution $u$, and so admit shock-wave solutions.

As before, we restrict our attention to a rectangular mesh $(t_j, x_m)$ with uniform time
step size $\Delta t = t_{j+1} - t_j$ and space mesh size $\Delta x = x_{m+1} - x_m$. We use $u_{j,m} \approx u(t_j, x_m)$
to denote our numerical approximation to the solution $u(t, x)$ at the indicated node. The
simplest numerical scheme is obtained by replacing the time and space derivatives by their
first-order finite difference approximations (5.1):

$$ \frac{\partial u}{\partial t}(t_j, x_m) \approx \frac{u_{j+1,m} - u_{j,m}}{\Delta t} + O(\Delta t), \qquad \frac{\partial u}{\partial x}(t_j, x_m) \approx \frac{u_{j,m+1} - u_{j,m}}{\Delta x} + O(\Delta x). \tag{5.37} $$

Substituting these expressions into the transport equation (5.35) leads to the explicit numerical
scheme

$$ u_{j+1,m} = -\sigma\,u_{j,m+1} + (\sigma + 1)\,u_{j,m}, \tag{5.38} $$

in which the parameter

$$ \sigma = \frac{c\,\Delta t}{\Delta x} \tag{5.39} $$

depends on the wave speed and the ratio of time to space step sizes. Since we are employing
first-order approximations to both derivatives, we should choose the step sizes to be
comparable: $\Delta t \sim \Delta x$. When working on a bounded interval, say $0 \le x \le \ell$, we will need
to specify a value for the numerical solution at the right end, e.g., setting $u_{j,n} = 0$, which
corresponds to imposing the boundary condition $u(t, \ell) = 0$.
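In code, one step of this scheme is a single pass over the mesh. The following Python sketch implements the update $u_{j+1,m} = -\sigma\,u_{j,m+1} + (\sigma + 1)\,u_{j,m}$ on a bounded interval, with the value at the right end prescribed; the function name `transport_forward_step`, the argument `right_value`, and the sample data are illustrative assumptions.

```python
def transport_forward_step(u, sigma, right_value=0.0):
    """One step of the forward-difference transport scheme.

    u holds the nodal values u_{j,0}, ..., u_{j,n}; sigma = c * dt / dx.
    The last node has no right neighbor, so its new value is prescribed.
    """
    new = [-sigma * u[m + 1] + (sigma + 1) * u[m] for m in range(len(u) - 1)]
    new.append(right_value)
    return new

# Illustrative step with sigma = -1/2: each new value is the average of
# the old value at the node and the old value at its right neighbor.
print(transport_forward_step([0.0, 2.0, 4.0, 0.0], -0.5))
```

Note that the update at node $m$ draws only on nodes $m$ and $m + 1$, a feature that turns out to be decisive for the stability of the scheme.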

In Figure 5.5, we plot the numerical solutions, at times $t = .1, .2, .3$, arising from the
following initial condition:

$$ u(0, x) = f(x) = .4\,e^{-300\,(x - .5)^2} + .1\,e^{-300\,(x - .65)^2}. \tag{5.40} $$

We use step sizes $\Delta t = \Delta x = .005$, and try four different values of the wave speed. The
cases $c = .5$ and $c = -1.5$ clearly exhibit some form of numerical instability. The numerical
solution when $c = -.5$ is a bit more reasonable, although one can already observe some
degradation due to the relatively low accuracy of the scheme. This can be alleviated by
employing a smaller step size. The case $c = -1$ looks exceptionally good, and you are
asked to provide an explanation in Exercise 5.3.6.

Figure 5.5. Numerical solutions to the transport equation.


The  CFL  Condition 

There are two ways to understand the observed numerical instability. First, we recall
that the exact solution (5.36) is constant along the characteristic lines $x = c\,t + \xi$, and
hence the value of $u(t, x)$ depends only on the initial value $f(\xi)$ at the point $\xi = x - c\,t$.
On the other hand, at time $t = t_j$, the numerical solution $u_{j,m} \approx u(t_j, x_m)$ computed
using (5.38) depends on the values of $u_{j-1,m}$ and $u_{j-1,m+1}$. The latter two values have
been computed from the previous approximations $u_{j-2,m}$, $u_{j-2,m+1}$, $u_{j-2,m+2}$. And so
on. Going all the way back to the initial time $t_0 = 0$, we find that $u_{j,m}$ depends on the
initial values $u_{0,m} = f(x_m), \ldots, u_{0,m+j} = f(x_m + j\,\Delta x)$ at the nodes lying in the interval
$x_m \le x \le x_m + j\,\Delta x$. On the other hand, the actual solution $u(t_j, x_m)$ depends only on
the value of $f(\xi)$, where

$$ \xi = x_m - c\,t_j = x_m - c\,j\,\Delta t. $$

Thus, if $\xi$ lies outside the interval $[\,x_m, x_m + j\,\Delta x\,]$, then varying the initial condition
near the point $x = \xi$ will change the actual solution value $u(t_j, x_m)$ without altering its
numerical approximation $u_{j,m}$ at all! So the numerical scheme cannot possibly provide an
accurate approximation to the solution value. As a result, we must require

$$ x_m \le \xi = x_m - c\,j\,\Delta t \le x_m + j\,\Delta x, \qquad \text{and hence} \qquad 0 \le -\,c\,\Delta t \le \Delta x, $$

which we rewrite as

$$ 0 \ge \sigma = \frac{c\,\Delta t}{\Delta x} \ge -1, \qquad \text{or, equivalently,} \qquad -\,\frac{\Delta x}{\Delta t} \le c \le 0. \tag{5.41} $$

This is the simplest manifestation of what is known as the Courant-Friedrichs-Lewy condition,
or CFL condition for short, which was established in the groundbreaking 1928
paper [33] by three of the pioneers in the development of numerical methods for partial
differential equations: the German (soon to be American) applied mathematicians Richard
Courant, Kurt Friedrichs, and Hans Lewy. Note that the CFL condition requires that the
wave speed be negative, and the time step size not too large. Thus, for allowable wave
speeds, the finite difference scheme (5.38) is conditionally stable.

Figure 5.6. The CFL condition.
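The CFL test is simple enough to automate before running a computation. The Python sketch below checks the inequality $-\Delta x/\Delta t \le c \le 0$ and, for comparison, computes the foot $\xi = x - c\,t$ of the characteristic and the interval of initial nodes on which the numerical value at node $(t_j, x_m)$ depends. All function names are hypothetical, and the sample values are the step sizes and wave speeds of the experiment described above.

```python
def cfl_forward_ok(c, dt, dx):
    """CFL condition for the forward scheme: -dx/dt <= c <= 0."""
    return -dx / dt <= c <= 0.0

def characteristic_foot(c, t, x):
    """Point xi = x - c t whose initial value determines the exact u(t, x)."""
    return x - c * t

def numerical_interval(j, m, dx):
    """Interval [x_m, x_m + j dx] of initial data used to compute u_{j,m}."""
    return (m * dx, (m + j) * dx)

dt = dx = 0.005
for c in (0.5, -0.5, -1.0, -1.5):
    print(c, cfl_forward_ok(c, dt, dx))

# Node (t_20, x_100) = (0.1, 0.5) with the unstable speed c = 0.5:
print(characteristic_foot(0.5, 20 * dt, 100 * dx), numerical_interval(20, 100, dx))
```

For $c = .5$ the foot of the characteristic lies to the left of the interval of dependence, so the numerical value cannot respond to the initial data that actually determine the solution.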

The CFL condition can be recast in a more geometrically transparent manner as
follows. For the finite difference scheme (5.38), the numerical domain of dependence of a
point $(t_j, x_m)$ is the triangle

$$ \Bigl\{\, (t, x) \;\Big|\; 0 \le t \le t_j, \quad x_m \le x \le x_m + \frac{\Delta x}{\Delta t}\,(t_j - t) \,\Bigr\}. \tag{5.42} $$

The reason for this nomenclature is that, as we have just seen, the numerical approximation
to the solution at the node $(t_j, x_m)$ depends on the computed values at the nodes lying
within its numerical domain of dependence; see Figure 5.6. The CFL condition (5.41)
requires that, for all $0 \le t \le t_j$, the characteristic passing through the point $(t_j, x_m)$ lie
entirely within the numerical domain of dependence (5.42). If the characteristic ventures
outside the domain, then the scheme will be numerically unstable. With this geometric
reformulation, the CFL criterion can be applied to both linear and nonlinear transport
equations that have nonuniform wave speeds.

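The CFL criterion (5.41) is reconfirmed by a von Neumann stability analysis. As before, we test the numerical scheme on an exponential function. Substituting

$$u_{j,m} = e^{\,i k x_m}, \qquad u_{j+1,m} = \lambda\, e^{\,i k x_m}, \tag{5.43}$$

into (5.38) leads to

$$\lambda\, e^{\,i k x_m} = -\,\sigma\, e^{\,i k x_{m+1}} + (\sigma + 1)\, e^{\,i k x_m} = \bigl(-\,\sigma\, e^{\,i k \Delta x} + \sigma + 1\bigr)\, e^{\,i k x_m}.$$

The resulting (complex) magnification factor

$$\lambda = 1 + \sigma\bigl(1 - e^{\,i k \Delta x}\bigr) = \bigl(1 + \sigma - \sigma \cos(k\,\Delta x)\bigr) - i\,\sigma \sin(k\,\Delta x)$$

satisfies the stability criterion $|\lambda| \le 1$ if and only if

$$|\lambda|^2 = \bigl(1 + \sigma - \sigma \cos(k\,\Delta x)\bigr)^2 + \bigl(\sigma \sin(k\,\Delta x)\bigr)^2 = 1 + 2\,\sigma(\sigma + 1)\bigl(1 - \cos(k\,\Delta x)\bigr) = 1 + 4\,\sigma(\sigma + 1)\sin^2\bigl(\tfrac12 k\,\Delta x\bigr) \le 1$$

for all $k$. Thus, stability requires that $\sigma(\sigma + 1) \le 0$, and thus $-1 \le \sigma \le 0$, in complete accord with the CFL condition (5.41).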
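The von Neumann criterion just derived can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `forward_scheme_factor` and `is_von_neumann_stable` are illustrative and not part of the text. It samples the magnification factor of scheme (5.38) over one period of $k\,\Delta x$ and reports whether $|\lambda| \le 1$ throughout.

```python
import numpy as np

def forward_scheme_factor(sigma, k_dx):
    """Magnification factor lambda = 1 + sigma*(1 - exp(i k dx)) of scheme (5.38)."""
    return 1.0 + sigma * (1.0 - np.exp(1j * k_dx))

def is_von_neumann_stable(sigma, samples=721, tol=1e-12):
    """True when |lambda| <= 1 (up to rounding) at every sampled k*dx in [0, 2*pi]."""
    k_dx = np.linspace(0.0, 2.0 * np.pi, samples)
    return bool(np.all(np.abs(forward_scheme_factor(sigma, k_dx)) <= 1.0 + tol))

if __name__ == "__main__":
    # sigma = c*dt/dx; the CFL range for this scheme is -1 <= sigma <= 0.
    for sigma in (-1.5, -1.0, -0.5, 0.0, 0.5):
        print(f"sigma = {sigma:5.2f}  stable: {is_von_neumann_stable(sigma)}")
```

Sampling $k\,\Delta x$ on a finite grid is only a spot check, not a proof; the closed-form expression for $|\lambda|^2$ above is what settles the question.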


Upwind and Lax-Wendroff Schemes


To obtain a finite difference scheme that can be used for positive wave speeds, we replace the forward finite difference approximation to $\partial u/\partial x$ by the corresponding backwards difference quotient, namely, (5.1) with $h = -\,\Delta x$, leading to the alternative first-order numerical scheme

$$u_{j+1,m} = -\,(\sigma - 1)\,u_{j,m} + \sigma\,u_{j,m-1}, \tag{5.44}$$

where $\sigma = c\,\Delta t/\Delta x$ is as before. A similar analysis, left to the reader, produces the corresponding CFL stability criterion

$$0 \le \sigma = \frac{c\,\Delta t}{\Delta x} \le 1,$$

and so this scheme can be applied for suitable positive wave speeds.

In this manner, we have produced one numerical scheme that works for negative wave speeds, and an alternative scheme for positive speeds. The question arises, particularly when one is dealing with equations with variable wave speeds, whether one can devise a scheme that is (conditionally) stable for both positive and negative wave speeds. One might be tempted to use the centered difference approximation (5.6):

$$\frac{\partial u}{\partial x}(t_j, x_m) = \frac{u(t_j, x_{m+1}) - u(t_j, x_{m-1})}{2\,\Delta x} + O\bigl((\Delta x)^2\bigr). \tag{5.45}$$

Figure 5.7. The CFL condition for the centered difference scheme.


Substituting (5.45) and the previous approximation to the time derivative (5.37) into (5.35) leads to the numerical scheme

$$u_{j+1,m} = -\,\tfrac12\,\sigma\,u_{j,m+1} + u_{j,m} + \tfrac12\,\sigma\,u_{j,m-1}, \tag{5.46}$$

where, as usual, $\sigma = c\,\Delta t/\Delta x$. In this case, the numerical domain of dependence of the node $(t_j, x_m)$ consists of the nodes in the triangle

$$T_{(t_j,x_m)} = \{\,(t,x) \mid 0 \le t \le t_j,\ \ x_m - t_j + t \le x \le x_m + t_j - t\,\}. \tag{5.47}$$

The CFL condition requires that, for $0 \le t \le t_j$, the characteristic going through $(t_j, x_m)$ lie within this triangle, as in Figure 5.7, which imposes the condition

$$|\sigma| = \frac{|c|\,\Delta t}{\Delta x} \le 1, \qquad \text{or, equivalently,} \qquad |c| \le \frac{\Delta x}{\Delta t}. \tag{5.48}$$


Unfortunately, although it satisfies the CFL condition over this range of wave speeds, the centered difference scheme is, in fact, always unstable! For instance, the instability of the numerical solution to the preceding initial value problem (5.40) for $c = 1$ can be observed in Figure 5.8. This is confirmed by applying a von Neumann analysis: substitute (5.43) into (5.46), and cancel the common exponential factors. Provided $\sigma \ne 0$, which means that $c \ne 0$, the resulting magnification factor

$$\lambda = 1 - i\,\sigma \sin(k\,\Delta x)$$

satisfies $|\lambda| > 1$ for all $k$ with $\sin(k\,\Delta x) \ne 0$. Thus, for $c \ne 0$, the centered difference scheme (5.46) is unstable for all (nonzero) wave speeds!

Figure 5.8. Centered difference numerical solution to the transport equation.
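The contrast between the centered scheme (5.46) and the backward scheme (5.44) can be seen in a short experiment. The sketch below is illustrative only: it assumes NumPy, uses periodic wrap-around via `np.roll` instead of the boundary treatment behind Figure 5.8, and picks the initial data $1/(1+x^2)$ and $\sigma = 0.5$ as arbitrary test values. The function names are hypothetical.

```python
import numpy as np

def centered_step(u, sigma):
    """One step of (5.46): -sigma/2 * u[m+1] + u[m] + sigma/2 * u[m-1] (periodic wrap assumed)."""
    return u - 0.5 * sigma * (np.roll(u, -1) - np.roll(u, 1))

def backward_step(u, sigma):
    """One step of (5.44): -(sigma - 1) * u[m] + sigma * u[m-1] (periodic wrap assumed)."""
    return (1.0 - sigma) * u + sigma * np.roll(u, 1)

def max_after(step, sigma=0.5, steps=400, n=200):
    """Largest |u| after the given number of steps, starting from 1/(1+x^2) on [-10, 10)."""
    x = np.linspace(-10.0, 10.0, n, endpoint=False)
    u = 1.0 / (1.0 + x**2)
    for _ in range(steps):
        u = step(u, sigma)
    return float(np.max(np.abs(u)))

if __name__ == "__main__":
    print("backward scheme max |u|:", max_after(backward_step))
    print("centered scheme max |u|:", max_after(centered_step))
```

For $0 \le \sigma \le 1$ the backward step is a convex combination of neighboring values, so its maximum cannot grow, whereas the centered step amplifies every mode with $\sin(k\,\Delta x) \ne 0$ by the factor $\sqrt{1 + \sigma^2 \sin^2(k\,\Delta x)}$ at each step.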


One possible means of overcoming the sign restriction on the wave speed is to use the forward difference scheme (5.38) when the wave speed is negative and the backwards scheme (5.44) when it is positive. The resulting scheme, valid for varying wave speeds $c(t,x)$, takes the form

$$u_{j+1,m} = \begin{cases} -\,\sigma_{j,m}\,u_{j,m+1} + (\sigma_{j,m} + 1)\,u_{j,m}, & c_{j,m} \le 0,\\[2pt] -\,(\sigma_{j,m} - 1)\,u_{j,m} + \sigma_{j,m}\,u_{j,m-1}, & c_{j,m} > 0, \end{cases} \tag{5.49}$$

where

$$\sigma_{j,m} = c_{j,m}\,\frac{\Delta t}{\Delta x}, \qquad c_{j,m} = c(t_j, x_m). \tag{5.50}$$

This is referred to as an upwind scheme, since the second node always lies "upwind" (that is, away from the direction of motion) from the reference point $(t_j, x_m)$. The upwind scheme works reasonably well over short time intervals, assuming that the space step size is sufficiently small and the time step satisfies the CFL condition $|c_{j,m}| \le \Delta x/\Delta t$ at each node, cf. (5.41). However, over longer time intervals, as we already observed in Figure 5.5, the simple upwind scheme tends to produce a noticeable damping of waves or, alternatively, require an unacceptably small step size. One way of overcoming this defect is to use the popular Lax-Wendroff scheme, which is based on second-order approximations to the derivatives. In the case of constant wave speed, the iterative step takes the form

$$u_{j+1,m} = \tfrac12\,\sigma(\sigma - 1)\,u_{j,m+1} - (\sigma^2 - 1)\,u_{j,m} + \tfrac12\,\sigma(\sigma + 1)\,u_{j,m-1}. \tag{5.51}$$

The stability analysis of the Lax-Wendroff scheme is relegated to the exercises. Extensions to variable wave speeds are more subtle, and we refer the reader to [80] for a detailed derivation.
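To make the two update rules concrete, here is a minimal sketch of the upwind step (5.49) and the Lax-Wendroff step (5.51), assuming NumPy. Periodic wrap-around via `np.roll` is an assumption made for brevity, and the helper `transport_errors`, the Gaussian test profile, and the step sizes are hypothetical choices, not taken from the text.

```python
import numpy as np

def upwind_step(u, c, dt, dx):
    """One step of the upwind scheme (5.49); c is a scalar or an array of nodal speeds c_{j,m}."""
    sigma = np.asarray(c, dtype=float) * dt / dx * np.ones_like(u)
    forward = -sigma * np.roll(u, -1) + (sigma + 1.0) * u    # (5.38), used where c <= 0
    backward = -(sigma - 1.0) * u + sigma * np.roll(u, 1)    # (5.44), used where c > 0
    return np.where(sigma <= 0.0, forward, backward)

def lax_wendroff_step(u, sigma):
    """One step of the constant-speed Lax-Wendroff scheme (5.51)."""
    return (0.5 * sigma * (sigma - 1.0) * np.roll(u, -1)
            - (sigma**2 - 1.0) * u
            + 0.5 * sigma * (sigma + 1.0) * np.roll(u, 1))

def transport_errors(c=1.0, dx=0.1, dt=0.05, t_final=2.0):
    """Max errors of both schemes for u_t + c u_x = 0 with u(0,x) = exp(-x^2) on [-10, 10)."""
    x = np.linspace(-10.0, 10.0, int(round(20.0 / dx)), endpoint=False)
    u_up = np.exp(-x**2)
    u_lw = u_up.copy()
    for _ in range(int(round(t_final / dt))):
        u_up = upwind_step(u_up, c, dt, dx)
        u_lw = lax_wendroff_step(u_lw, c * dt / dx)
    exact = np.exp(-(x - c * t_final) ** 2)   # the profile translated by c*t
    return float(np.max(np.abs(u_up - exact))), float(np.max(np.abs(u_lw - exact)))

if __name__ == "__main__":
    err_up, err_lw = transport_errors()
    print(f"upwind error: {err_up:.4f}   Lax-Wendroff error: {err_lw:.4f}")
```

In this test the upwind result shows the damping described above, while the Lax-Wendroff error is noticeably smaller at the same step sizes.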


Exercises 

5.3.1. Solve the initial value problem $u_t = 3\,u_x$, $u(0,x) = 1/(1 + x^2)$, on the interval $[-10, 10]$ using an upwind scheme with space step size $\Delta x = .1$. Decide on an appropriate time step size, and graph your solution at times $t = .5, 1, 1.5$. Discuss what you observe.

5.3.2. Solve Exercise 5.3.1 for the nonuniform transport equations

(a)  ut  +  4  (1  +  x2)-1  ux  =  0,  (b)  ut  =  (3  —  2e~x  ux, 

(c)  ut  +  7x  (1  +  x2)-1  ux  =  0,  (d)  ut  +  (2  tan-1  ^  x)  ux  =  0. 

5.3.3.  Consider  the  initial  value  problem 

ut  +  x2^_  1ux=0’  u(0,x)  =  (l  - \x2)e~x  /3. 

On the interval $[-5, 5]$, using space step size $\Delta x = .1$ and time step size $\Delta t = .025$, apply (a) the forward scheme (5.38) (suitably modified for variable wave speed), (b) the backward scheme (5.44) (suitably modified for variable wave speed), and (c) the upwind scheme (5.49). Graph the resulting numerical solutions at times $t = .5, 1, 1.5$, and discuss what you observe in each case. Which of the schemes are stable?

5.3.4.  Use  the  centered  difference  scheme  (5.46)  to  solve  the  initial  value  problem  in  Exercise 
5.3.1.  Do  you  observe  any  instabilities  in  your  numerical  solution? 

5.3.5.  Use  the  Lax-Wendroff  scheme  (5.51)  to  solve  the  initial  value  problem  in  Exercise  5.3.1. 
Discuss  the  accuracy  of  your  solution  in  comparison  with  the  upwind  scheme. 

0  5.3.6. Can you explain why, in Figure 5.5, the numerical solution in the case $c = -1$ is significantly better than for $c = -.5$, or, indeed, for any other $c$ in the stable range?

5.3.7. Nonlinear transport equations are often solved numerically by writing them in the form of a conservation law, and then applying the finite difference formulas directly to the conserved density and flux. (a) Devise an upwind scheme for numerically solving our favorite nonlinear transport equation, $u_t + \tfrac12\,(u^2)_x = 0$. (b) Test your scheme on the initial value problem $u(0,x) = e^{-x^2}$.

5.3.8.  (a)  Design  a  stable  numerical  solution  scheme  for  the  damped  transport  equation 

ut  +  +  u  =  0.  (b)  Test  your  scheme  on  the  initial  value  problem  with  u(0,  x)  =  e  x  . 

Q  5.3.9.  Analyze  the  stability  of  the  numerical  scheme  (5.44)  by  applying  (a)  the  CFL  condition; 
(b)  a  von  Neumann  analysis.  Are  your  conclusions  the  same? 

^  5.3.10. For what choices of step size $\Delta t$, $\Delta x$ is the Lax-Wendroff scheme (5.51) stable?


Let us now develop some basic numerical solution techniques for the second-order wave equation. As above, although we are in possession of the explicit d'Alembert solution formula (2.82), the lessons learned in designing viable schemes here will carry over to more complicated situations, including inhomogeneous media and higher-dimensional problems, for which analytic solution formulas may no longer be readily available.

Consider the wave equation

$$\frac{\partial^2 u}{\partial t^2} = c^2\,\frac{\partial^2 u}{\partial x^2}, \qquad 0 < x < \ell, \quad t \ge 0, \tag{5.52}$$

on a bounded interval of length $\ell$ with constant wave speed $c > 0$. For specificity, we impose (possibly time-dependent) Dirichlet boundary conditions

$$u(t, 0) = \alpha(t), \qquad u(t, \ell) = \beta(t), \qquad t \ge 0, \tag{5.53}$$

along with the usual initial conditions

$$u(0, x) = f(x), \qquad \frac{\partial u}{\partial t}(0, x) = g(x), \qquad 0 \le x \le \ell. \tag{5.54}$$

As usual, we adopt a uniformly spaced mesh

$$t_j = j\,\Delta t, \qquad x_m = m\,\Delta x, \qquad \text{where} \quad \Delta x = \frac{\ell}{n}.$$

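Discretization is implemented by replacing the second-order derivatives in the wave equation by their standard finite difference approximations (5.5):

$$\begin{aligned} \frac{\partial^2 u}{\partial t^2}(t_j, x_m) &= \frac{u(t_{j+1}, x_m) - 2\,u(t_j, x_m) + u(t_{j-1}, x_m)}{(\Delta t)^2} + O\bigl((\Delta t)^2\bigr),\\[4pt] \frac{\partial^2 u}{\partial x^2}(t_j, x_m) &= \frac{u(t_j, x_{m+1}) - 2\,u(t_j, x_m) + u(t_j, x_{m-1})}{(\Delta x)^2} + O\bigl((\Delta x)^2\bigr). \end{aligned} \tag{5.55}$$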


Since the error terms are both of second order, we anticipate being able to choose the space and time step sizes to have comparable magnitudes: $\Delta t \approx \Delta x$. Substituting the finite difference formulas (5.55) into the partial differential equation (5.52) and rearranging terms, we are led to the iterative system

$$u_{j+1,m} = \sigma^2\,u_{j,m+1} + 2\,(1 - \sigma^2)\,u_{j,m} + \sigma^2\,u_{j,m-1} - u_{j-1,m}, \qquad j = 1, 2, \ldots, \quad m = 1, \ldots, n - 1, \tag{5.56}$$

for the numerical approximations $u_{j,m} \approx u(t_j, x_m)$ to the solution values at the nodes. The parameter

$$\sigma = \frac{c\,\Delta t}{\Delta x} > 0 \tag{5.57}$$


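depends on the wave speed and the ratio of space and time step sizes. The boundary conditions (5.53) require that

$$u_{j,0} = \alpha_j = \alpha(t_j), \qquad u_{j,n} = \beta_j = \beta(t_j), \qquad j = 0, 1, 2, \ldots. \tag{5.58}$$

This allows us to rewrite the iterative system in vectorial form

$$\mathbf{u}^{(j+1)} = B\,\mathbf{u}^{(j)} - \mathbf{u}^{(j-1)} + \mathbf{b}^{(j)}, \tag{5.59}$$

where

$$B = \begin{pmatrix} 2(1 - \sigma^2) & \sigma^2 & & & \\ \sigma^2 & 2(1 - \sigma^2) & \sigma^2 & & \\ & \sigma^2 & \ddots & \ddots & \\ & & \ddots & \ddots & \sigma^2 \\ & & & \sigma^2 & 2(1 - \sigma^2) \end{pmatrix}, \qquad \mathbf{u}^{(j)} = \begin{pmatrix} u_{j,1} \\ u_{j,2} \\ \vdots \\ u_{j,n-2} \\ u_{j,n-1} \end{pmatrix}, \qquad \mathbf{b}^{(j)} = \begin{pmatrix} \sigma^2 \alpha_j \\ 0 \\ \vdots \\ 0 \\ \sigma^2 \beta_j \end{pmatrix}. \tag{5.60}$$

The entries of $\mathbf{u}^{(j)} \in \mathbb{R}^{n-1}$ are, as in (5.18), the numerical approximations to the solution values at the interior nodes. Note that (5.59) describes a second-order iterative scheme, since computing the subsequent iterate $\mathbf{u}^{(j+1)}$ requires knowing the values of the preceding two: $\mathbf{u}^{(j)}$ and $\mathbf{u}^{(j-1)}$.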

The one subtlety is how to get the method started. We know $\mathbf{u}^{(0)}$, since its entries $u_{0,m} = f_m = f(x_m)$ are determined by the initial position. However, we also need $\mathbf{u}^{(1)}$ in order to launch the iteration and compute $\mathbf{u}^{(2)}, \mathbf{u}^{(3)}, \ldots$. Its entries $u_{1,m} \approx u(\Delta t, x_m)$ approximate the solution at time $t_1 = \Delta t$, whereas the initial velocity $u_t(0,x) = g(x)$ prescribes the derivatives $u_t(0, x_m) = g_m = g(x_m)$ at the initial time $t_0 = 0$. To resolve this difficulty, a first thought might be to use the finite difference approximation

$$g_m = \frac{\partial u}{\partial t}(0, x_m) \approx \frac{u(\Delta t, x_m) - u(0, x_m)}{\Delta t} \approx \frac{u_{1,m} - f_m}{\Delta t} \tag{5.61}$$

to compute the required values $u_{1,m} = f_m + g_m\,\Delta t$. However, the approximation (5.61) is accurate only to order $\Delta t$, whereas the rest of the scheme has errors proportional to $(\Delta t)^2$. The effect would be to introduce an unacceptably large error at the initial step, and the resulting solution would fail to conform to the desired order of accuracy.

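To construct an initial approximation to $\mathbf{u}^{(1)}$ with error on the order of $(\Delta t)^2$, we need to analyze the error in the approximation (5.61) in more depth. Note that, by Taylor's Theorem,

$$\begin{aligned} \frac{u(\Delta t, x_m) - u(0, x_m)}{\Delta t} &= \frac{\partial u}{\partial t}(0, x_m) + \frac12\,\frac{\partial^2 u}{\partial t^2}(0, x_m)\,\Delta t + O\bigl((\Delta t)^2\bigr)\\[4pt] &= \frac{\partial u}{\partial t}(0, x_m) + \frac{c^2}{2}\,\frac{\partial^2 u}{\partial x^2}(0, x_m)\,\Delta t + O\bigl((\Delta t)^2\bigr), \end{aligned}$$

since $u(t,x)$ solves the wave equation. Therefore,

$$\begin{aligned} u_{1,m} = u(\Delta t, x_m) &\approx u(0, x_m) + \frac{\partial u}{\partial t}(0, x_m)\,\Delta t + \frac{c^2}{2}\,\frac{\partial^2 u}{\partial x^2}(0, x_m)\,(\Delta t)^2\\[4pt] &= f(x_m) + g(x_m)\,\Delta t + \frac{c^2}{2}\,f''(x_m)\,(\Delta t)^2\\[4pt] &\approx f_m + g_m\,\Delta t + \frac{c^2\,(f_{m+1} - 2 f_m + f_{m-1})\,(\Delta t)^2}{2\,(\Delta x)^2}, \end{aligned}$$

where the last line, which employs the finite difference approximation (5.5) to the second derivative, can be used if the explicit formula for $f''(x)$ is either not known or too complicated to evaluate directly. Therefore, we initiate the scheme by setting

$$u_{1,m} = \tfrac12\,\sigma^2 f_{m+1} + (1 - \sigma^2)\,f_m + \tfrac12\,\sigma^2 f_{m-1} + g_m\,\Delta t, \tag{5.62}$$

or, in vectorial form,

$$\mathbf{u}^{(0)} = \mathbf{f}, \qquad \mathbf{u}^{(1)} = \tfrac12\,B\,\mathbf{u}^{(0)} + \mathbf{g}\,\Delta t + \tfrac12\,\mathbf{b}^{(0)}, \tag{5.63}$$

where $\mathbf{f} = (f_1, f_2, \ldots, f_{n-1})^T$, $\mathbf{g} = (g_1, g_2, \ldots, g_{n-1})^T$ are the sampled values of the initial data. This serves to maintain the desired second-order accuracy of the scheme.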
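The complete algorithm, the starting step (5.62) followed by the iteration (5.56), fits in a few lines. The following is a minimal sketch assuming NumPy; the function name `solve_wave`, its argument list, and the Gaussian test data are illustrative choices rather than anything prescribed by the text.

```python
import numpy as np

def solve_wave(f, g, c, length, n, dt, steps, alpha=lambda t: 0.0, beta=lambda t: 0.0):
    """Advance u_tt = c^2 u_xx by `steps` >= 1 time steps using (5.62) and then (5.56).

    f, g       : initial position and velocity (functions accepting NumPy arrays)
    alpha, beta: Dirichlet boundary data at x = 0 and x = length, as functions of t
    Returns the mesh x_0, ..., x_n and the approximation at time steps*dt.
    """
    dx = length / n
    sigma2 = (c * dt / dx) ** 2
    x = np.linspace(0.0, length, n + 1)

    prev = np.array(f(x), dtype=float)             # u^(0): sampled initial position
    prev[0], prev[-1] = alpha(0.0), beta(0.0)

    curr = np.empty_like(prev)                     # u^(1) from the second-order start (5.62)
    curr[1:-1] = (0.5 * sigma2 * (prev[2:] + prev[:-2])
                  + (1.0 - sigma2) * prev[1:-1]
                  + dt * np.asarray(g(x[1:-1]), dtype=float))
    curr[0], curr[-1] = alpha(dt), beta(dt)

    for j in range(1, steps):                      # u^(j+1) from u^(j) and u^(j-1), cf. (5.56)
        nxt = np.empty_like(curr)
        nxt[1:-1] = (sigma2 * (curr[2:] + curr[:-2])
                     + 2.0 * (1.0 - sigma2) * curr[1:-1]
                     - prev[1:-1])
        nxt[0], nxt[-1] = alpha((j + 1) * dt), beta((j + 1) * dt)
        prev, curr = curr, nxt
    return x, curr

# Illustrative data: a narrow Gaussian hump, zero initial velocity, unit wave speed.
hump = lambda x: np.exp(-400.0 * (x - 0.3) ** 2)
at_rest = lambda x: np.zeros_like(x)

if __name__ == "__main__":
    for dt in (0.01, 0.02):                        # sigma = 0.9 and sigma = 1.8 when n = 90
        x, u = solve_wave(hump, at_rest, c=1.0, length=1.0, n=90, dt=dt, steps=round(0.5 / dt))
        print(f"dt = {dt}:  max |u| at t = 0.5 is {np.max(np.abs(u)):.3e}")
```

Only the two most recent time levels are stored, which reflects the second-order (two-step) nature of (5.59).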


Example 5.6. Consider the particular initial value problem

$$u_{tt} = u_{xx}, \qquad u(0, x) = e^{-400\,(x - .3)^2}, \quad u_t(0, x) = 0, \quad 0 \le x \le 1, \qquad u(t, 0) = u(t, 1) = 0, \quad t \ge 0,$$

subject to homogeneous Dirichlet boundary conditions on the interval $[0,1]$. The initial data is a fairly concentrated hump centered at $x = .3$. As time progresses, we expect the initial hump to split into two half-sized humps, which then collide with the ends of the interval, reversing direction and orientation.


Figure 5.9. Numerically stable waves. (Panels: t = 0, .1, .2, .3, .4, .5.)

Figure 5.10. Numerically unstable waves. (Panels: t = 0, .04, .08.)

For our numerical approximation, let us use a space discretization consisting of 90 equally spaced points, and so $\Delta x = \frac{1}{90} = .0111\ldots$. If we choose a time step of $\Delta t = .01$, whereby $\sigma = .9$, then we obtain a reasonably accurate solution over a fairly long time range, as plotted in Figure 5.9. On the other hand, if we double the time step, setting $\Delta t = .02$, so $\sigma = 1.8$, then, as shown in Figure 5.10, we induce an instability that eventually overwhelms the numerical solution. Thus, the preceding numerical scheme appears to be only conditionally stable.

Figure 5.11. The CFL condition for the wave equation.


Stability analysis proceeds along the same lines as in the first-order case. The CFL condition requires that the characteristics emanating from a node $(t_j, x_m)$ remain, for times $0 \le t \le t_j$, in its numerical domain of dependence, which, for our particular numerical scheme, is the same triangle

$$T_{(t_j,x_m)} = \{\,(t,x) \mid 0 \le t \le t_j,\ \ x_m - t_j + t \le x \le x_m + t_j - t\,\},$$

now plotted in Figure 5.11. Since the characteristics are the lines of slope $\pm c$, the CFL condition is the same as in (5.48):

$$\sigma = \frac{c\,\Delta t}{\Delta x} \le 1, \qquad \text{or, equivalently,} \qquad 0 < c \le \frac{\Delta x}{\Delta t}. \tag{5.64}$$

The resulting stability criterion explains the observed difference between the numerically stable and unstable cases.

However, as we noted above, the CFL condition is, in general, only necessary for stability of the numerical scheme; sufficiency requires that we perform a von Neumann stability analysis. To this end, we specialize the calculation to a single complex exponential $e^{\,i k x}$. After one time step, the scheme will have the effect of multiplying it by the magnification factor $\lambda = \lambda(k)$, after another time step by $\lambda^2$, and so on. To determine $\lambda$, we substitute the relevant sampled exponential values

$$u_{j-1,m} = e^{\,i k x_m}, \qquad u_{j,m} = \lambda\, e^{\,i k x_m}, \qquad u_{j+1,m} = \lambda^2\, e^{\,i k x_m}, \tag{5.65}$$

into the scheme (5.56). After canceling the common exponential, we find that the magnification factor satisfies the following quadratic equation:

$$\lambda^2 = \bigl(2 - 4\,\sigma^2 \sin^2\bigl(\tfrac12 k\,\Delta x\bigr)\bigr)\,\lambda - 1,$$

whence

$$\lambda = \alpha \pm \sqrt{\alpha^2 - 1}, \qquad \text{where} \qquad \alpha = 1 - 2\,\sigma^2 \sin^2\bigl(\tfrac12 k\,\Delta x\bigr). \tag{5.66}$$

Thus, there are two different magnification factors associated with each complex exponential, which is, in fact, a consequence of the scheme being of second order. Stability requires that both be $\le 1$ in modulus. Now, if the CFL condition (5.64) holds, then $|\alpha| \le 1$, which implies that both magnification factors (5.66) are complex numbers of modulus $|\lambda| = 1$, and thus the numerical scheme satisfies the stability criterion (5.26). On the other hand, if $\sigma > 1$, then $\alpha < -1$ over a range of values of $k$, which implies that the two magnification factors (5.66) are both real and one of them is $< -1$, thus violating the stability criterion. Consequently, the CFL condition (5.64) does indeed distinguish between the stable and unstable finite difference schemes for the wave equation.
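The two cases of this argument can be illustrated by evaluating the pair of magnification factors (5.66) directly. This is a small sketch assuming NumPy, with hypothetical function names; it scans $k\,\Delta x$ over one period and records the largest modulus encountered.

```python
import numpy as np

def wave_factors(sigma, k_dx):
    """The two roots lambda = alpha +/- sqrt(alpha^2 - 1) of the quadratic behind (5.66)."""
    alpha = 1.0 - 2.0 * sigma**2 * np.sin(0.5 * k_dx) ** 2
    root = np.sqrt(complex(alpha * alpha - 1.0))   # complex sqrt covers the case |alpha| < 1
    return alpha + root, alpha - root

def max_modulus(sigma, samples=361):
    """Largest |lambda| over sampled values of k*dx in [0, 2*pi]."""
    return max(abs(lam)
               for k_dx in np.linspace(0.0, 2.0 * np.pi, samples)
               for lam in wave_factors(sigma, k_dx))

if __name__ == "__main__":
    for sigma in (0.5, 0.9, 1.0, 1.8):
        print(f"sigma = {sigma}:  max |lambda| = {max_modulus(sigma):.4f}")
```

For $\sigma \le 1$ every sampled factor has modulus 1 up to rounding, while for $\sigma > 1$ one of the real roots has modulus exceeding 1.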


Exercises 


5.4.1.  Suppose  you  are  asked  to  numerically  approximate  the  solution  to  the  initial-boundary 
value  problem 


u 


tt 


64 u  ,  u(t,  0)  =  u(t,  3)  =  0,  u(0,  x) 


1  -  2|x  -  1 

0, 


1  <  x  <  4 

2  -  x  -  2’ 

otherwise. 


ut( 0,  x )  =  0. 


on the interval $0 \le x \le 3$, using (5.56) with space step size $\Delta x = .1$. (a) What range of time steps $\Delta t$ are allowed? (b) Test your answer by implementing the numerical solution for one value of $\Delta t$ in the allowable range and one value outside. Discuss what you observe in your numerical solutions. (c) In the stable range, compare your numerical solution with that obtained using the smaller step size $\Delta x = .01$ and a suitable time step $\Delta t$.


5.4.2.  Solve  Exercise  5.4.1  for  the  boundary  value  problem 
utt  =  64uxxl  u(£,  0)  =  0  =  u(£,  3),  u(0,  x)  =  0,  ut( 0,x) 


/  1  -2 

x  —  1 

l  o, 

1  <  x  <  4 

2  -  x  -  2  ’ 

otherwise. 


4  + 


X  —  T 


2a;  —  | 


ut( 0,  x)  =  0. 


5.4.3.  Solve  the  following  initial-boundary  value  problem 

utt  =  9 uxxl  u(t,  0)  =  u(t ,  1)  =  0,  u( 0,  x)  —  2 

on  the  interval  0  <  x  <  1,  using  the  numerical  scheme  (5.56)  with  space  step  sizes  Ax  = 
.1,  .01  and  .001  and  suitably  chosen  time  steps.  Discuss  which  features  of  the  solution  can 
be  observed  in  your  numerical  approximations. 


5.4.4. (a) Use a numerical integrator with space step size $\Delta x = .05$ to solve the periodically forced boundary value problem

$$u_{tt} = u_{xx}, \qquad u(0, x) = u_t(0, x) = 0, \qquad u(t, 0) = \sin t, \qquad u(t, 1) = 0.$$

Is your solution periodic? (b) Repeat the computation using the alternative boundary condition $u(t, 0) = \sin \pi t$. Discuss any observed differences between the two problems.

5.4.5. (a) Design an explicit numerical scheme for solving the initial-boundary value problem

$$u_{tt} = c^2 u_{xx} + F(t, x), \qquad u(t, 0) = u(t, 1) = 0, \qquad u(0, x) = f(x), \quad u_t(0, x) = g(x), \qquad 0 \le x \le 1,$$

for the wave equation with an external forcing term $F(t, x)$. Clearly state any stability conditions that need to be imposed on the time and space step sizes.

(b)  Test  your  scheme  on  the  particular  case  c  =  | ,  F(t,x)  =  3sign(x  —  sin7rt,  /(x)  = 

g(x)  =  0,  using  space  step  sizes  Ax  =  .05  and  .01,  and  suitably  chosen  time  steps. 

5.4.6. Let $\beta > 0$. (a) Design a finite difference scheme for approximating the solution to the initial-boundary value problem

$$u_{tt} + \beta\,u_t = c^2 u_{xx}, \qquad u(t, 0) = u(t, 1) = 0, \qquad u(0, x) = f(x), \quad u_t(0, x) = g(x),$$

for the damped wave equation on the interval $0 \le x \le 1$. (b) Discuss the stability of your scheme. What choice of step sizes will ensure stability? (c) Test your scheme with $c = 1$, $\beta = 1$, using the initial data $f(x) = e^{-(x - .7)^2}$, $g(x) = 0$.


5.5 Finite Difference Algorithms for the Laplace and Poisson Equations


Finally, let us discuss the implementation of finite difference numerical schemes for elliptic boundary value problems. We concentrate on the simplest cases: the two-dimensional Laplace and Poisson equations. The basic issues are already apparent in this particular context, and extensions to more general equations, higher dimensions, and higher-order schemes are all reasonably straightforward. In Chapter 10, we will present a competitor, the renowned finite element method, which, while relying on more sophisticated mathematical machinery, enjoys several advantages, including more immediate adaptability to variable mesh sizes and more sophisticated geometries.

For specificity, we concentrate on the Dirichlet boundary value problem

$$\begin{aligned} -\,\Delta u = -\,u_{xx} - u_{yy} &= f(x, y) &&\text{for} \quad (x, y) \in \Omega,\\ u(x, y) &= g(x, y) &&\text{for} \quad (x, y) \in \partial\Omega, \end{aligned} \tag{5.67}$$

on a bounded planar domain $\Omega \subset \mathbb{R}^2$. The first step is to discretize the domain $\Omega$ by constructing a rectangular mesh. Thus, the finite difference method is particularly suited to domains whose boundary lines up with the coordinate axes; otherwise, the mesh nodes do not, generally, lie exactly on $\partial\Omega$, making the approximation of the boundary data more challenging, although not insurmountable.

For simplicity, let us study the case in which

$$\Omega = \{\,a < x < b,\ \ c < y < d\,\}$$

is a rectangle. We introduce a regular rectangular mesh, with $x$ and $y$ spacings given, respectively, by

$$\Delta x = \frac{b - a}{m}, \qquad \Delta y = \frac{d - c}{n},$$

for positive integers $m, n$. Thus, the interior of the rectangle contains $(m - 1)(n - 1)$ interior nodes

$$(x_i, y_j) = (a + i\,\Delta x,\ c + j\,\Delta y) \qquad \text{for} \quad 0 < i < m, \quad 0 < j < n.$$

In addition, the $2m + 2n$ boundary nodes $(x_0, y_j) = (a, y_j)$, $(x_m, y_j) = (b, y_j)$, $(x_i, y_0) = (x_i, c)$, $(x_i, y_n) = (x_i, d)$, lie on the boundary of the rectangle.

At each interior node, we employ the centered difference formula (5.5) to approximate the relevant second-order derivatives:

$$\begin{aligned} \frac{\partial^2 u}{\partial x^2}(x_i, y_j) &= \frac{u(x_{i+1}, y_j) - 2\,u(x_i, y_j) + u(x_{i-1}, y_j)}{(\Delta x)^2} + O\bigl((\Delta x)^2\bigr),\\[4pt] \frac{\partial^2 u}{\partial y^2}(x_i, y_j) &= \frac{u(x_i, y_{j+1}) - 2\,u(x_i, y_j) + u(x_i, y_{j-1})}{(\Delta y)^2} + O\bigl((\Delta y)^2\bigr). \end{aligned} \tag{5.68}$$

Substituting these finite difference formulae into the Poisson equation produces the linear system

$$-\,\frac{u_{i+1,j} - 2\,u_{i,j} + u_{i-1,j}}{(\Delta x)^2} - \frac{u_{i,j+1} - 2\,u_{i,j} + u_{i,j-1}}{(\Delta y)^2} = f_{i,j}, \qquad i = 1, \ldots, m - 1, \quad j = 1, \ldots, n - 1, \tag{5.69}$$

in which $u_{i,j}$ denotes our numerical approximation to the solution values $u(x_i, y_j)$ at the nodes, while $f_{i,j} = f(x_i, y_j)$. If we set

$$\rho = \frac{\Delta x}{\Delta y}, \tag{5.70}$$

then (5.69) can be rewritten in the form

$$2\,(1 + \rho^2)\,u_{i,j} - (u_{i-1,j} + u_{i+1,j}) - \rho^2\,(u_{i,j-1} + u_{i,j+1}) = (\Delta x)^2 f_{i,j}, \qquad i = 1, \ldots, m - 1, \quad j = 1, \ldots, n - 1. \tag{5.71}$$

Since both finite difference approximations (5.68) are of second order, one should choose $\Delta x$ and $\Delta y$ to be of comparable size, thus keeping $\rho$ around 1.

The linear system (5.71) forms the finite difference approximation to the Poisson equation at the interior nodes. It is supplemented by the discretized Dirichlet boundary conditions

$$\begin{aligned} u_{i,0} &= g_{i,0}, & u_{i,n} &= g_{i,n}, & i &= 0, \ldots, m,\\ u_{0,j} &= g_{0,j}, & u_{m,j} &= g_{m,j}, & j &= 0, \ldots, n. \end{aligned} \tag{5.72}$$


These boundary values can be substituted directly into the system, making (5.71) a system of $(m - 1)(n - 1)$ linear equations involving the $(m - 1)(n - 1)$ unknowns $u_{i,j}$ for $1 \le i \le m - 1$, $1 \le j \le n - 1$. We impose some convenient ordering for these entries, e.g., from left to right and then bottom to top, forming the column vector of unknowns

$$\mathbf{w} = \bigl(w_1, w_2, \ldots, w_{(m-1)(n-1)}\bigr)^T = \bigl(u_{1,1}, u_{2,1}, \ldots, u_{m-1,1}, u_{1,2}, u_{2,2}, \ldots, u_{m-1,2}, u_{1,3}, \ldots, u_{m-1,n-1}\bigr)^T. \tag{5.73}$$

The combined linear system (5.71-72) can then be rewritten in matrix form

$$A\,\mathbf{w} = \hat{\mathbf{f}}, \tag{5.74}$$

where the right-hand side is obtained by combining the column vector $\mathbf{f} = (\,\ldots\ f_{i,j}\ \ldots\,)^T$ with the boundary data provided by (5.72) according to where they appear in the system. The implementation will become clearer once we work through a small-scale example.


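Example 5.7. To better understand how the process works, let us look at the case in which $\Omega = \{\,0 < x < 1,\ 0 < y < 1\,\}$ is the unit square. In order to write everything in full detail, we start with a very coarse mesh with $\Delta x = \Delta y = \frac14$; see Figure 5.12. Thus $m = n = 4$, resulting in a total of nine interior nodes. In this case, $\rho = 1$, and hence the finite difference system (5.71) consists of the following nine equations:

$$\begin{aligned} -\,u_{1,0} - u_{0,1} + 4\,u_{1,1} - u_{2,1} - u_{1,2} &= \tfrac{1}{16}\,f_{1,1},\\ -\,u_{2,0} - u_{1,1} + 4\,u_{2,1} - u_{3,1} - u_{2,2} &= \tfrac{1}{16}\,f_{2,1},\\ -\,u_{3,0} - u_{2,1} + 4\,u_{3,1} - u_{4,1} - u_{3,2} &= \tfrac{1}{16}\,f_{3,1},\\ -\,u_{1,1} - u_{0,2} + 4\,u_{1,2} - u_{2,2} - u_{1,3} &= \tfrac{1}{16}\,f_{1,2},\\ -\,u_{2,1} - u_{1,2} + 4\,u_{2,2} - u_{3,2} - u_{2,3} &= \tfrac{1}{16}\,f_{2,2},\\ -\,u_{3,1} - u_{2,2} + 4\,u_{3,2} - u_{4,2} - u_{3,3} &= \tfrac{1}{16}\,f_{3,2},\\ -\,u_{1,2} - u_{0,3} + 4\,u_{1,3} - u_{2,3} - u_{1,4} &= \tfrac{1}{16}\,f_{1,3},\\ -\,u_{2,2} - u_{1,3} + 4\,u_{2,3} - u_{3,3} - u_{2,4} &= \tfrac{1}{16}\,f_{2,3},\\ -\,u_{3,2} - u_{2,3} + 4\,u_{3,3} - u_{4,3} - u_{3,4} &= \tfrac{1}{16}\,f_{3,3}. \end{aligned} \tag{5.75}$$

Figure 5.12. Square mesh with $\Delta x = \Delta y = \frac14$.

(Note that the values at the four corner nodes, $u_{0,0}, u_{4,0}, u_{0,4}, u_{4,4}$, do not appear.) The boundary data imposes the additional conditions (5.72), namely

$$\begin{aligned} u_{0,1} &= g_{0,1}, & u_{0,2} &= g_{0,2}, & u_{0,3} &= g_{0,3}, & u_{1,0} &= g_{1,0}, & u_{2,0} &= g_{2,0}, & u_{3,0} &= g_{3,0},\\ u_{4,1} &= g_{4,1}, & u_{4,2} &= g_{4,2}, & u_{4,3} &= g_{4,3}, & u_{1,4} &= g_{1,4}, & u_{2,4} &= g_{2,4}, & u_{3,4} &= g_{3,4}. \end{aligned}$$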


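The system (5.75) can be written in matrix form $A\,\mathbf{w} = \hat{\mathbf{f}}$, where

$$A = \begin{pmatrix} 4 & -1 & 0 & -1 & 0 & 0 & 0 & 0 & 0\\ -1 & 4 & -1 & 0 & -1 & 0 & 0 & 0 & 0\\ 0 & -1 & 4 & 0 & 0 & -1 & 0 & 0 & 0\\ -1 & 0 & 0 & 4 & -1 & 0 & -1 & 0 & 0\\ 0 & -1 & 0 & -1 & 4 & -1 & 0 & -1 & 0\\ 0 & 0 & -1 & 0 & -1 & 4 & 0 & 0 & -1\\ 0 & 0 & 0 & -1 & 0 & 0 & 4 & -1 & 0\\ 0 & 0 & 0 & 0 & -1 & 0 & -1 & 4 & -1\\ 0 & 0 & 0 & 0 & 0 & -1 & 0 & -1 & 4 \end{pmatrix}, \tag{5.76}$$

and

$$\mathbf{w} = \begin{pmatrix} w_1\\ w_2\\ w_3\\ w_4\\ w_5\\ w_6\\ w_7\\ w_8\\ w_9 \end{pmatrix} = \begin{pmatrix} u_{1,1}\\ u_{2,1}\\ u_{3,1}\\ u_{1,2}\\ u_{2,2}\\ u_{3,2}\\ u_{1,3}\\ u_{2,3}\\ u_{3,3} \end{pmatrix}, \qquad \hat{\mathbf{f}} = \begin{pmatrix} \tfrac{1}{16}\,f_{1,1} + g_{1,0} + g_{0,1}\\ \tfrac{1}{16}\,f_{2,1} + g_{2,0}\\ \tfrac{1}{16}\,f_{3,1} + g_{3,0} + g_{4,1}\\ \tfrac{1}{16}\,f_{1,2} + g_{0,2}\\ \tfrac{1}{16}\,f_{2,2}\\ \tfrac{1}{16}\,f_{3,2} + g_{4,2}\\ \tfrac{1}{16}\,f_{1,3} + g_{0,3} + g_{1,4}\\ \tfrac{1}{16}\,f_{2,3} + g_{2,4}\\ \tfrac{1}{16}\,f_{3,3} + g_{4,3} + g_{3,4} \end{pmatrix}.$$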

Note that the known boundary values, namely $u_{i,j} = g_{i,j}$ when $i$ or $j$ equals 0 or 4, have been incorporated into the right-hand side $\hat{\mathbf{f}}$ of the finite difference linear system (5.74). The resulting linear system is easily solved by Gaussian Elimination, [89]. Finer meshes lead to correspondingly larger linear systems, all endowed with a common overall structure, as discussed below.

For example, the function

$$u(x, y) = y\,\sin(\pi x)$$

solves the particular boundary value problem

$$-\,\Delta u = \pi^2\,y\,\sin(\pi x), \qquad u(x, 0) = u(0, y) = u(1, y) = 0, \quad u(x, 1) = \sin(\pi x), \qquad 0 < x, y < 1.$$

Setting up and solving the linear system (5.75) produces the finite difference solution values

$$\begin{aligned} u_{1,1} &= .1831, & u_{2,1} &= .2589, & u_{3,1} &= .1831,\\ u_{1,2} &= .3643, & u_{2,2} &= .5152, & u_{3,2} &= .3643,\\ u_{1,3} &= .5409, & u_{2,3} &= .7649, & u_{3,3} &= .5409, \end{aligned}$$

leading to the numerical approximation plotted in the first graph† of Figure 5.13. The maximal error between the numerical and exact solution values is .01520, which occurs at the center of the square. In the second and third graphs, the mesh spacing is successively reduced by half, so there are, respectively, $m = n = 8$ and 16 nodes in each coordinate direction. The corresponding maximal numerical errors at the nodes are .004123 and .001035. Observe that halving the step size reduces the error by a factor of approximately $\frac14$, which is consistent with the numerical scheme being of second order.
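The computation in this example is easy to reproduce. The following is a minimal sketch, assuming NumPy and a dense solver (adequate only for small meshes); the function name `solve_poisson_unit_square` and the use of Kronecker products to assemble the matrix are implementation choices made here, not taken from the text. It treats the unit square with $\Delta x = \Delta y = 1/m$, so that $\rho = 1$.

```python
import numpy as np

def solve_poisson_unit_square(m, f, g):
    """Solve (5.71)-(5.72) on the unit square with dx = dy = 1/m (rho = 1).

    f(x, y) is the right-hand side and g(x, y) the Dirichlet data; both must accept arrays.
    Returns X, Y, U with U[i, j] approximating u(x_i, y_j).
    """
    h = 1.0 / m
    nodes = np.linspace(0.0, 1.0, m + 1)
    X, Y = np.meshgrid(nodes, nodes, indexing="ij")    # X[i, j] = x_i, Y[i, j] = y_j
    U = np.array(g(X, Y), dtype=float)                 # boundary values; interior overwritten below

    N = m - 1
    T = 2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
    # With the ordering (5.73) (i runs fastest), this reproduces (5.76) when m = 4.
    A = np.kron(np.eye(N), T) + np.kron(T, np.eye(N))

    rhs = h * h * np.array(f(X[1:-1, 1:-1], Y[1:-1, 1:-1]), dtype=float)
    rhs[0, :] += U[0, 1:-1]       # neighbors on the side x = 0
    rhs[-1, :] += U[m, 1:-1]      # neighbors on the side x = 1
    rhs[:, 0] += U[1:-1, 0]       # neighbors on the side y = 0
    rhs[:, -1] += U[1:-1, m]      # neighbors on the side y = 1

    w = np.linalg.solve(A, rhs.flatten(order="F"))     # Fortran order: i runs fastest
    U[1:-1, 1:-1] = w.reshape((N, N), order="F")
    return X, Y, U

exact = lambda x, y: y * np.sin(np.pi * x)               # the test solution used above
source = lambda x, y: np.pi**2 * y * np.sin(np.pi * x)   # f = -Laplacian of the test solution

def max_error(m):
    """Maximal nodal error for the test problem on an m-by-m mesh."""
    X, Y, U = solve_poisson_unit_square(m, source, exact)
    return float(np.max(np.abs(U - exact(X, Y))))

if __name__ == "__main__":
    for m in (4, 8, 16):
        print(f"m = {m:2d}:  maximal nodal error = {max_error(m):.6f}")
```

The exact solution is passed as the boundary function `g` purely for convenience; only its values on the boundary of the square are used.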


Remark: The preceding test is a particular instance of the method of manufactured solutions, in which one starts with a preselected function that almost certainly is not a solution to the exact problem at hand. Nevertheless, substituting this function into the differential equation and the relevant initial and/or boundary conditions leads to an inhomogeneous problem of the same character as the original. After running the numerical scheme on the modified problem, one can test for accuracy by comparing the numerical output with the preselected function.


† We are using flat triangles to interpolate the nodal data. Smoother interpolation schemes, e.g., splines, [102], will produce a more realistic reproduction of the analytic solution graph.

Figure 5.13. Finite difference solutions to a Poisson boundary value problem.


Solution  Strategies 

The  linear  algebraic  system  resulting  from  a  finite  difference  discretization  can  be  rather 
large,  and  it  behooves  us  to  devise  efficient  solution  strategies.  The  general  finite  difference 
coefficient  matrix  A  has  a  very  structured  form,  which  can  already  be  inferred  from  the 
very  simple  case  (5.76).  When  the  underlying  domain  is  a  rectangle,  it  assumes  a  block 
tridiagonal  form 


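$$
A = \begin{pmatrix}
B_\rho & -\rho^2 I \\
-\rho^2 I & B_\rho & -\rho^2 I \\
& -\rho^2 I & B_\rho & -\rho^2 I \\
& & \ddots & \ddots & \ddots \\
& & & -\rho^2 I & B_\rho & -\rho^2 I \\
& & & & -\rho^2 I & B_\rho
\end{pmatrix},
\tag{5.77}
$$

in which $I$ is the $(m-1) \times (m-1)$ identity matrix, while

$$
B_\rho = \begin{pmatrix}
2(1+\rho^2) & -1 \\
-1 & 2(1+\rho^2) & -1 \\
& -1 & 2(1+\rho^2) & -1 \\
& & \ddots & \ddots & \ddots \\
& & & -1 & 2(1+\rho^2) & -1 \\
& & & & -1 & 2(1+\rho^2)
\end{pmatrix}
\tag{5.78}
$$

is itself an $(m-1) \times (m-1)$ tridiagonal matrix. (Here and below, all entries not explicitly
indicated are zero.) There are $n-1$ blocks in both the row and column directions. Within each
block, the off-diagonal entries $-1$ couple horizontally adjacent nodes, while the blocks
$-\rho^2 I$ couple vertically adjacent rows of nodes, in accordance with the coefficients in (5.85).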
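
The block structure can be checked by assembling the coefficient matrix directly from the five-point stencil. The sketch below is illustrative only: it assumes that the unknowns are ordered row by row, $k = (j-1)(m-1) + i$, which is consistent with blocks of size $(m-1) \times (m-1)$, and the helper names `assemble` and `block` are hypothetical.

```python
def assemble(m, n, rho):
    """Matrix of the stencil 2(1+rho^2) u_ij - (u_{i-1,j} + u_{i+1,j})
    - rho^2 (u_{i,j-1} + u_{i,j+1}) over the interior nodes, ordered row by row."""
    size = m - 1
    N = size * (n - 1)
    A = [[0.0] * N for _ in range(N)]

    def k(i, j):  # position of node (i, j), with i = 1..m-1 and j = 1..n-1
        return (j - 1) * size + (i - 1)

    for j in range(1, n):
        for i in range(1, m):
            row = A[k(i, j)]
            row[k(i, j)] = 2 * (1 + rho ** 2)
            if i > 1:
                row[k(i - 1, j)] = -1.0
            if i < m - 1:
                row[k(i + 1, j)] = -1.0
            if j > 1:
                row[k(i, j - 1)] = -rho ** 2
            if j < n - 1:
                row[k(i, j + 1)] = -rho ** 2
    return A

def block(A, p, q, size):
    """Extract the (p, q) block of the given size (zero-based block indices)."""
    return [row[q * size:(q + 1) * size] for row in A[p * size:(p + 1) * size]]

m, n, rho = 5, 4, 0.5
size = m - 1
A = assemble(m, n, rho)

# Expected diagonal block, off-diagonal block, and zero block.
B_rho = [[2 * (1 + rho ** 2) if a == b else -1.0 if abs(a - b) == 1 else 0.0
          for b in range(size)] for a in range(size)]
coupling = [[-rho ** 2 if a == b else 0.0 for b in range(size)]
            for a in range(size)]
zero = [[0.0] * size for _ in range(size)]

print(block(A, 0, 0, size) == B_rho, block(A, 0, 1, size) == coupling)
```

With $m = 5$ and $n = 4$, the matrix has $3 \times 3$ blocks, each of size $4 \times 4$.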

When  the  finite  difference  linear  system  is  of  moderate  size,  it  can  be  efficiently  solved 
by  Gaussian  Elimination,  which  effectively  factorizes  A  =  LU  into  a  product  of  lower 
and  upper  triangular  matrices.  (This  follows  since  A  is  symmetric  and  nonsingular,  as 
guaranteed  by  Theorem  5.8  below.)  In  the  present  case,  the  factors  are  block  bidiagonal 
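matrices:

$$
L = \begin{pmatrix}
I \\
L_1 & I \\
& L_2 & I \\
& & \ddots & \ddots \\
& & & L_{n-2} & I
\end{pmatrix},
\qquad
U = \begin{pmatrix}
U_1 & -\rho^2 I \\
& U_2 & -\rho^2 I \\
& & \ddots & \ddots \\
& & & U_{n-2} & -\rho^2 I \\
& & & & U_{n-1}
\end{pmatrix},
\tag{5.79}
$$

where the individual blocks are again of size $(m-1) \times (m-1)$. Indeed, multiplying out the
matrix product $LU$ and equating the result to (5.77) leads to the iterative matrix system

$$
U_1 = B_\rho, \qquad L_j = -\rho^2 U_j^{-1}, \qquad U_{j+1} = B_\rho + \rho^2 L_j, \qquad j = 1, \ldots, n-2,
\tag{5.80}
$$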

which  produces  the  individual  blocks. 

With  the  LU  factors  in  place,  we  can  apply  Forward  and  Back  Substitution  to  solve 
the  block  tridiagonal  linear  system  Aw  =  f  by  solving  the  block  lower  and  upper  triangular 
systems 

L  z  =  f ,  U  w  =  z.  (5.81) 

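In view of the forms (5.79) of $L$ and $U$, if we write

$$
w = \begin{pmatrix} w^{(1)} \\ w^{(2)} \\ \vdots \\ w^{(n-1)} \end{pmatrix},
\qquad
z = \begin{pmatrix} z^{(1)} \\ z^{(2)} \\ \vdots \\ z^{(n-1)} \end{pmatrix},
\qquad
f = \begin{pmatrix} f^{(1)} \\ f^{(2)} \\ \vdots \\ f^{(n-1)} \end{pmatrix},
$$

so that each $w^{(j)}$, $z^{(j)}$, $f^{(j)}$ is a vector with $m-1$ entries, then we must successively solve

$$
\begin{aligned}
z^{(1)} &= f^{(1)}, & z^{(j+1)} &= f^{(j+1)} - L_j z^{(j)}, & j &= 1, 2, \ldots, n-2, \\
U_{n-1} w^{(n-1)} &= z^{(n-1)}, & U_k w^{(k)} &= z^{(k)} + \rho^2 w^{(k+1)}, & k &= n-2, n-3, \ldots, 1,
\end{aligned}
\tag{5.82}
$$

in the prescribed order. In view of the identification of $L_k$ with $-\rho^2$ times the inverse of
$U_k$, the last set of equations in (5.82) is perhaps better written as

$$
w^{(k)} = -L_k \bigl( w^{(k+1)} + \rho^{-2} z^{(k)} \bigr), \qquad k = n-2, n-3, \ldots, 1.
\tag{5.83}
$$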
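
The block factorization and the two substitution sweeps can be sketched as follows. This is a minimal illustration rather than an efficient implementation: it assumes that the diagonal block $B_\rho$ is tridiagonal with diagonal entries $2(1+\rho^2)$ and off-diagonal entries $-1$, that the off-diagonal blocks of $A$ are $-\rho^2 I$, and it inverts the small blocks by a dense Gauss-Jordan routine. All function names are hypothetical.

```python
def solve_matrix(M, R):
    """Return M^{-1} R via Gauss-Jordan elimination with partial pivoting."""
    k = len(M)
    aug = [list(M[i]) + list(R[i]) for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(k):
            if r != c:
                factor = aug[r][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    return [row[k:] for row in aug]

def solve_vector(M, v):
    return [row[0] for row in solve_matrix(M, [[x] for x in v])]

def block_lu_solve(m, n, rho, f):
    """Solve A w = f, with f given as n-1 blocks of m-1 entries each."""
    size, r2 = m - 1, rho ** 2
    B = [[2 * (1 + r2) if a == b else -1.0 if abs(a - b) == 1 else 0.0
          for b in range(size)] for a in range(size)]
    eye = [[1.0 if a == b else 0.0 for b in range(size)] for a in range(size)]
    U, L = [B], []
    for j in range(n - 2):                      # block recursion (5.80)
        Lj = [[-r2 * v for v in row] for row in solve_matrix(U[j], eye)]
        L.append(Lj)
        U.append([[B[a][b] + r2 * Lj[a][b] for b in range(size)]
                  for a in range(size)])
    z = [list(f[0])]                            # forward substitution, L z = f
    for j in range(n - 2):
        Lz = [sum(a * b for a, b in zip(row, z[j])) for row in L[j]]
        z.append([fv - lv for fv, lv in zip(f[j + 1], Lz)])
    w = [None] * (n - 1)                        # back substitution, U w = z
    w[-1] = solve_vector(U[-1], z[-1])
    for k in range(n - 3, -1, -1):
        rhs = [zv + r2 * wv for zv, wv in zip(z[k], w[k + 1])]
        w[k] = solve_vector(U[k], rhs)
    return w

def residual(m, n, rho, w, f):
    """Largest violation of the five-point equations by the computed w."""
    r2 = rho ** 2

    def val(i, j):  # zero Dirichlet values outside the interior index range
        return w[j][i] if 0 <= i < m - 1 and 0 <= j < n - 1 else 0.0

    return max(abs(2 * (1 + r2) * val(i, j) - val(i - 1, j) - val(i + 1, j)
                   - r2 * (val(i, j - 1) + val(i, j + 1)) - f[j][i])
               for j in range(n - 1) for i in range(m - 1))

m, n, rho = 6, 5, 0.5
f = [[float((i + 1) * (j + 2) % 7) for i in range(m - 1)] for j in range(n - 1)]
w = block_lu_solve(m, n, rho, f)
print(residual(m, n, rho, w, f))
```

The residual is evaluated directly from the five-point stencil, so it provides a check on the factorization that is independent of the block formulas.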

As the number of nodes becomes large, the preceding elimination/factorization approach
to solving the linear system becomes increasingly inefficient, and one often switches
to an iterative solution method such as Gauss-Seidel, Jacobi, or, even better, Successive
Over-Relaxation (SOR); indeed, SOR was originally designed to speed up the solution
of the large-scale linear systems arising from the numerical solution of elliptic partial
differential equations. Detailed discussions of iterative matrix methods can be found in
[89; Chapter 10] and [118]. For the SOR method, a good choice for the relaxation parameter is

$$
\omega = \frac{4}{2 + \sqrt{4 - \cos^2(\pi/m) - \cos^2(\pi/n)}}.
\tag{5.84}
$$



Iterative  solution  methods  are  even  more  attractive  in  dealing  with  irregular  domains, 
whose  finite  difference  coefficient  matrix,  while  still  sparse,  is  less  structured  than  in  the 
rectangular  case,  and  hence  less  amenable  to  fast  Gaussian  Elimination  algorithms. 
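
As a rough illustration of the effect of relaxation, the following sketch compares the number of sweeps needed by Gauss-Seidel ($\omega = 1$) and by SOR with the parameter (5.84). The test problem ($-\Delta u = 1$ on the unit square with zero Dirichlet data), the stopping tolerance, and the function name `sor_poisson` are assumptions made for this example.

```python
import math

def sor_poisson(m, n, omega, tol=1e-10, max_sweeps=100_000):
    """SOR for -Laplace(u) = 1 on the unit square with zero boundary values.
    m, n are the numbers of mesh intervals in x and y. Returns (u, sweeps)."""
    dx, dy = 1.0 / m, 1.0 / n
    r2 = (dx / dy) ** 2
    u = [[0.0] * (n + 1) for _ in range(m + 1)]
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for j in range(1, n):
            for i in range(1, m):
                # Gauss-Seidel value from the five-point formula.
                gs = (u[i - 1][j] + u[i + 1][j]
                      + r2 * (u[i][j - 1] + u[i][j + 1]) + dx * dx) / (2 * (1 + r2))
                # Over-relax: move past the Gauss-Seidel value by the factor omega.
                new = u[i][j] + omega * (gs - u[i][j])
                change = max(change, abs(new - u[i][j]))
                u[i][j] = new
        if change < tol:
            return u, sweep
    raise RuntimeError("SOR did not converge")

m = n = 16
omega = 4.0 / (2.0 + math.sqrt(4.0 - math.cos(math.pi / m) ** 2
                               - math.cos(math.pi / n) ** 2))
u_gs, sweeps_gs = sor_poisson(m, n, 1.0)
u_sor, sweeps_sor = sor_poisson(m, n, omega)
difference = max(abs(u_gs[i][j] - u_sor[i][j])
                 for i in range(m + 1) for j in range(n + 1))
print(omega, sweeps_gs, sweeps_sor, difference)
```

Both iterations converge to the same discrete solution; the relaxed version requires fewer sweeps.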

Finally,  let  us  address  the  question  of  unique  solvability  of  the  finite  difference  linear 
system  obtained  by  discretization  of  the  Poisson  equation  on  a  bounded  domain  subject  to 
Dirichlet boundary conditions. As in the Uniqueness Theorem 4.10 for the original boundary
value problem, this will follow from an easily established Maximum Principle for the discrete
system that directly mimics the Laplace equation maximum principle of Theorem 4.9.

Theorem 5.8. Let $\Omega$ be a bounded domain. Then the finite difference linear system
(5.74) has a unique solution.

Proof :  The  result  will  follow  if  we  can  prove  that  the  only  solution  to  the  corresponding 
homogeneous  linear  system  Aw  =  0  is  the  trivial  solution  w  =  0.  The  homogeneous 
system  corresponds  to  discretizing  the  Laplace  equation  subject  to  zero  Dirichlet  boundary 
conditions. 

Now,  in  view  of  (5.71),  each  equation  in  the  homogeneous  linear  system  can  be  written 
in  the  form 


$$
u_{i,j} = \frac{u_{i-1,j} + u_{i+1,j} + \rho^2 u_{i,j-1} + \rho^2 u_{i,j+1}}{2(1+\rho^2)}.
\tag{5.85}
$$

If $\rho = 1$, then (5.85) says that the value of $u_{i,j}$ at the node $(x_i, y_j)$ is equal to the average
of the values at the four neighboring nodes. For general $\rho$, it says that $u_{i,j}$ is a weighted
average of the four neighboring values. In either case, the value of $u_{i,j}$ must lie strictly
between the maximum and minimum values of $u_{i-1,j}$, $u_{i+1,j}$, $u_{i,j-1}$, and $u_{i,j+1}$, unless
all these values are the same, in which case $u_{i,j}$ also has the same value. This observation
suffices to establish a Maximum Principle for the finite difference system for the Laplace
equation: namely, that its solution cannot achieve a local maximum or minimum at an
interior node.

Now suppose that the homogeneous finite difference system $Aw = 0$ for the domain $\Omega$
has a nontrivial solution $w \ne 0$. Let $u_{i,j} = w_k$ be the maximal entry of this purported
solution.  The  Maximum  Principle  requires  that  all  four  of  its  neighboring  values  must  have 
the  same  maximal  value.  But  then  the  same  argument  applies  to  the  neighbors  of  those 
entries,  to  their  neighbors,  and  so  on.  Eventually  one  of  the  neighbors  is  at  a  boundary 
node,  but,  since  we  are  dealing  with  the  homogeneous  Dirichlet  boundary  value  problem, 
its  value  is  zero.  This  immediately  implies  that  all  the  entries  of  w  must  be  zero,  which  is 
a  contradiction.  Q.E.D. 


Rigorously  establishing  convergence  of  the  finite  difference  solution  to  the  analytic 
solution  to  the  boundary  value  problem  as  the  step  size  goes  to  zero  will  not  be  discussed 
here,  and  we  refer  the  reader  to  [6,  80]  for  precise  results  and  proofs. 
Exercises

♠ 5.5.1. Solve the Dirichlet problem $\Delta u = 0$, $u(x,0) = \sin x$, $u(x,\pi) = 0$, $u(0,y) = 0$,
$u(\pi,y) = 0$, numerically using a finite difference scheme. Compare your approximation with
the solution you obtained in Exercise 4.3.10(a).

♠ 5.5.2. Solve the Dirichlet problem $\Delta u = 0$, $u(x,0) = x$, $u(x,1) = 1 - x$, $u(0,y) = y$,
$u(1,y) = 1 - y$, numerically via finite differences. Compare your approximation with the
solution you obtained in Exercise 4.3.12(d).

♠ 5.5.3. Consider the Dirichlet boundary value problem $\Delta u = 0$, $u(x,0) = \sin x$, $u(x,\pi) = 0$,
$u(0,y) = 0$, $u(\pi,y) = 0$, on the square $\{0 < x, y < \pi\}$. (a) Find the exact solution. (b) Set
up and solve the finite difference equations based on a square mesh with $m = n = 2$ squares
on each side of the full square. How close is this value to the exact solution at the center of
the square: $u(\tfrac12\pi, \tfrac12\pi)$? (c) Repeat part (b) for $m = n = 4$ squares per side. Is the value
of your approximation at the center of the square closer to the true solution? (d) Use
a computer to find a finite difference approximation to $u(\tfrac12\pi, \tfrac12\pi)$ using $m = n = 8$ and
16 squares per side. Is your approximation converging to the exact solution as the mesh
becomes finer and finer? Is the convergence rate consistent with the order of the finite
difference approximation?

♠ 5.5.4. (a) Use finite differences to approximate a solution to the Helmholtz boundary value
problem $\Delta u = u$, $u(x,0) = u(x,1) = u(0,y) = 0$, $u(1,y) = 1$, on the unit square
$0 < x, y < 1$. (b) Use separation of variables to construct a series solution. Do your
analytic and numerical solutions match? Explain any discrepancies.

♠ 5.5.5. A drum is in the shape of an L, as in the accompanying figure, whose
short sides all have length 1. (a) Use a finite difference scheme with mesh
spacing $\Delta x = \Delta y = .1$ to find and graph the equilibrium configuration
when the drum is subject to a unit upwards force while all its sides are
fixed to the $(x, y)$-plane. What is the maximal deflection, and at which
point(s) does it occur? (b) Check the accuracy of your answer in part (a)
by reducing the step size by half: $\Delta x = \Delta y = .05$.

♣ 5.5.6. A metal plate has the shape of a 3 cm square with a 1 cm square hole cut out of the
middle. The plate is heated by making the inner edge have temperature 100° while keeping
the outer edge at 0°. (a) Find the (approximate) equilibrium temperature using finite
differences with a mesh width of $\Delta x = \Delta y = .5$ cm. Plot your approximate solution using
a three-dimensional graphics program. (b) Let $C$ denote the square contour lying midway
between the inner and outer square boundaries of the plate. Using your finite difference
approximation, determine at what point(s) on $C$ the temperature is (i) minimized;
(ii) maximized; (iii) equal to the average of the two boundary temperatures.
(c) Repeat part (a) using a smaller mesh width of $\Delta x = \Delta y = .2$. How much does this
affect your answers in part (b)?

£  5.5.7.  Answer  Exercise  5.5.6  when  the  plate  is  additionally  subjected  to  a  constant  heat  source 

$f(x, y) = 600x + 800y - 2400$.

♠ 5.5.8. (a) Explain how to adapt the finite difference method to a mixed boundary value
problem on a rectangle with inhomogeneous Neumann conditions. Hint: Use a one-sided
difference formula of the appropriate order to approximate the normal derivative at the
boundary. (b) Apply your method to the problem

$$
\Delta u = 0, \qquad u(x,0) = 0, \qquad u(x,1) = 0, \qquad \frac{\partial u}{\partial x}(0,y) = y(1-y), \qquad u(1,y) = 0,
$$

using mesh sizes $\Delta x = \Delta y = .1$, $.01$, and $.001$. Compare your answers. (c) Solve the
boundary value problem via separation of variables, and compare the value of the solution
and the numerical approximations at the center of the square.


Chapter  6 

Generalized  Functions  and  Green’s  Functions 


Boundary  value  problems,  involving  both  ordinary  and  partial  differential  equations,  can 
be profitably viewed as the infinite-dimensional function space versions of finite-dimensional
systems of linear algebraic equations. As a result, linear algebra not only provides
us  with  important  insights  into  their  underlying  mathematical  structure,  but  also  motivates 
both  analytical  and  numerical  solution  techniques.  In  the  present  chapter,  we  develop  the 
method  of  Green’s  functions,  pioneered  by  the  early-nineteenth-century  self-taught  English 
mathematician (and miller!) George Green, whose famous Theorem you already encountered
in multivariable calculus. We begin with the simpler case of ordinary differential
equations,  and  then  move  on  to  solving  the  two-dimensional  Poisson  equation,  where  the 
Green’s  function  provides  a  powerful  alternative  to  the  method  of  separation  of  variables. 

For  inhomogeneous  linear  systems,  the  basic  Superposition  Principle  says  that  the 
response  to  a  combination  of  external  forces  is  the  self-same  combination  of  responses  to  the 
individual  forces.  In  a  finite-dimensional  system,  any  forcing  function  can  be  decomposed 
into  a  linear  combination  of  unit  impulse  forces,  each  applied  to  a  single  component  of  the 
system,  and  so  the  full  solution  can  be  obtained  by  combining  the  solutions  to  the  individual 
impulse  problems.  This  simple  idea  will  be  adapted  to  boundary  value  problems  governed 
by  differential  equations,  where  the  response  of  the  system  to  a  concentrated  impulse 
force  is  known  as  the  Green’s  function.  With  the  Green’s  function  in  hand,  the  solution 
to  the  inhomogeneous  system  with  a  general  forcing  function  can  be  reconstructed  by 
superimposing  the  effects  of  suitably  scaled  impulses.  Understanding  this  construction  will 
become  increasingly  important  as  we  progress  to  partial  differential  equations,  where  direct 
analytic  solution  techniques  are  far  harder  to  come  by. 

The  obstruction  blocking  a  direct  implementation  of  this  idea  is  that  there  is  no 
ordinary  function  that  represents  an  idealized  concentrated  impulse!  Indeed,  while  this 
approach  was  pioneered  by  Green  and  Cauchy  in  the  early  1800s,  and  then  developed 
into  an  effective  computational  tool  by  Heaviside  in  the  1880s,  it  took  another  60  years 
before  mathematicians  were  able  to  develop  a  completely  rigorous  theory  of  generalized 
functions, also known as distributions. In the language of generalized functions, a unit
impulse is represented by a delta function.† While we do not have the analytic tools to
completely develop the mathematical theory of generalized functions in its full, rigorous
glory, we will spend the first section learning the basic concepts and developing the practical
computational skills, including Fourier methods, required for applications. The second
section will discuss the method of Green’s functions in the context of one-dimensional
boundary value problems governed by ordinary differential equations. In the final section,
we develop the Green’s function method for solving basic boundary value problems for the
two-dimensional Poisson equation, which epitomizes the class of planar elliptic boundary
value problems.

† Warning: We follow common practice and refer to the “delta distribution” as a function,
even though, as we will see, it is most definitely not a function in the usual sense.


6.1  Generalized  Functions 


Our  goal  is  to  solve  inhomogeneous  linear  boundary  value  problems  by  first  determining 
the  effect  of  a  concentrated  impulse  force.  The  response  to  a  general  forcing  function  is 
then  found  by  linear  superposition.  But  before  diving  in,  let  us  first  review  the  relevant 
constructions  in  the  case  of  linear  systems  of  algebraic  equations. 

Consider a system of $n$ linear equations in $n$ unknowns† $\mathbf{u} = (u_1, u_2, \ldots, u_n)^T$, written
in matrix form

$$
A\,\mathbf{u} = \mathbf{f}.
\tag{6.1}
$$

Here $A$ is a fixed $n \times n$ matrix, assumed to be nonsingular, which ensures the existence
of a unique solution $\mathbf{u}$ for any choice of right-hand side $\mathbf{f} = (f_1, f_2, \ldots, f_n)^T \in \mathbb{R}^n$. We
regard the linear system (6.1) as representing the equilibrium equations of some physical
system, e.g., a system of masses interconnected by springs. In this context, the right-hand
side $\mathbf{f}$ represents an external forcing, so that its $i$th entry, $f_i$, represents the amount of force
exerted on the $i$th mass, while the $i$th entry of the solution vector, $u_i$, represents the $i$th
mass’ induced displacement.

Let

$$
\mathbf{e}_1 = (1, 0, \ldots, 0)^T, \qquad \mathbf{e}_2 = (0, 1, 0, \ldots, 0)^T, \qquad \ldots, \qquad \mathbf{e}_n = (0, \ldots, 0, 1)^T
\tag{6.2}
$$

denote the standard basis vectors of $\mathbb{R}^n$, so that $\mathbf{e}_j$ has a single 1 in its $j$th entry and all
other entries 0. We interpret each $\mathbf{e}_j$ as a concentrated unit impulse force that is applied
solely to the $j$th mass in our physical system. Let $\mathbf{u}_j = (u_{j,1}, \ldots, u_{j,n})^T$ be the induced
response of the system, that is, the solution to

$$
A\,\mathbf{u}_j = \mathbf{e}_j.
\tag{6.3}
$$

Let us suppose that we have calculated the response vectors $\mathbf{u}_1, \ldots, \mathbf{u}_n$ to each such impulse
force. We can express any other force vector as a linear combination,

$$
\mathbf{f} = f_1 \mathbf{e}_1 + f_2 \mathbf{e}_2 + \cdots + f_n \mathbf{e}_n,
\tag{6.4}
$$

of the impulse forces. The Superposition Principle of Theorem 1.7 then implies that the
solution to the inhomogeneous system (6.1) is the selfsame linear combination of the
individual impulse responses:

$$
\mathbf{u} = f_1 \mathbf{u}_1 + f_2 \mathbf{u}_2 + \cdots + f_n \mathbf{u}_n.
\tag{6.5}
$$

† All vectors are column vectors, but we sometimes write the transpose, which is a row vector,
to save space.

Thus,  knowing  how  the  linear  system  responds  to  each  impulse  force  allows  us  to  immedi¬ 
ately  calculate  its  response  to  a  general  external  force. 

Remark: The alert reader will recognize that $\mathbf{u}_1, \ldots, \mathbf{u}_n$ are the columns of the inverse
matrix, $A^{-1}$, and so formula (6.5) is, in fact, reconstructing the solution to the linear system
(6.1) by inverting its coefficient matrix: $\mathbf{u} = A^{-1}\mathbf{f}$. Thus, this observation is merely a
restatement of a standard linear algebraic system solution technique.
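
The superposition formula (6.5) is easily tested on a small example. The sketch below uses an illustrative $3 \times 3$ matrix of the kind arising from a chain of masses and springs; the matrix, the force vector, and the helper `solve` are assumptions chosen for the demonstration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            M[r] = [a - factor * v for a, v in zip(M[r], M[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

A = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]
f = [1.0, -2.0, 3.0]
n = len(A)

# Response u_j to the unit impulse e_j applied to the j-th component.
responses = [solve(A, [1.0 if i == j else 0.0 for i in range(n)])
             for j in range(n)]

# Superposition: u = f_1 u_1 + ... + f_n u_n.
u_super = [sum(f[j] * responses[j][i] for j in range(n)) for i in range(n)]
u_direct = solve(A, f)
print(u_super, u_direct)
```

The list `responses` holds the columns of $A^{-1}$, so the superposition is the product $A^{-1}\mathbf{f}$ written out column by column.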

The  Delta  Function 

The  aim  of  this  chapter  is  to  adapt  the  preceding  algebraic  solution  technique  to  boundary 
value  problems.  Suppose  we  want  to  solve  a  linear  boundary  value  problem  governed  by 
an ordinary differential equation on an interval $a < x < b$, the boundary conditions being
imposed at the endpoints. The key issue is how to characterize an impulse force that is
concentrated at a single point.

In general, a unit impulse at position $a < \xi < b$ will be described by something called
the delta function, and denoted by $\delta_\xi(x)$. Since the impulse is supposed to be concentrated
solely at $x = \xi$, our first requirement is

$$
\delta_\xi(x) = 0 \qquad \text{for} \qquad x \ne \xi.
\tag{6.6}
$$

Moreover, since the delta function represents a unit impulse, we want the total amount
of force to be equal to one. Since we are dealing with a continuum, the total force is
represented by an integral over the entire interval, and so we also require that the delta
function satisfy

$$
\int_a^b \delta_\xi(x)\,dx = 1, \qquad \text{provided} \qquad a < \xi < b.
\tag{6.7}
$$

Alas, there is no bona fide function that enjoys both of the required properties! Indeed,
according to the basic facts of Riemann (or even Lebesgue) integration, two functions that
are the same everywhere except at a single point have exactly the same integral, [96,98].
Thus, since $\delta_\xi$ is zero except at one point, its integral should be 0, not 1. The mathematical
conclusion is that the two requirements, (6.6-7), are inconsistent!

This unfortunate fact stopped mathematicians dead in their tracks. It took the imagination
of a British engineer, Oliver Heaviside, who was not deterred by the lack of rigorous
justification, to start utilizing delta functions in practical applications, with remarkable
effect. Despite his success, Heaviside was ridiculed by the mathematicians of his day, and
eventually succumbed to mental illness. But, some thirty years later, the great British
theoretical physicist Paul Dirac resurrected the delta function for quantum-mechanical
applications, and this finally made the mathematicians sit up and take notice. (Indeed, the
term “Dirac delta function” is quite common, even though Heaviside should rightly have
priority.) In 1944, the French mathematician Laurent Schwartz finally established a rigorous
theory of distributions that incorporated such useful but nonstandard objects, [103].
Thus, to be more accurate, we should really refer to the delta distribution; however, we
will retain the more common, intuitive designation “delta function” throughout. It is beyond
the scope of this introductory text to develop a fully rigorous theory of distributions.
Rather,  in  the  spirit  of  Heaviside,  we  shall  concentrate  on  learning,  through  practice  with 
computations  and  applications,  how  to  make  effective  use  of  these  exotic  mathematical 
creatures. 

There  are  two  possible  ways  to  introduce  the  delta  distribution.  Both  are  important 
and  worth  understanding. 


Method #1. Limits: The first approach is to regard the delta function $\delta_\xi(x)$ as a
limit of a sequence of ordinary smooth functions† $g_n(x)$. These will represent progressively
more and more concentrated unit forces, which, in the limit, converge to the desired unit
impulse concentrated at a single point, $x = \xi$. Thus, we require

$$
\lim_{n \to \infty} g_n(x) = 0, \qquad x \ne \xi,
\tag{6.8}
$$

while the total amount of force remains fixed at

$$
\int_a^b g_n(x)\,dx = 1 \qquad \text{for all } n.
\tag{6.9}
$$

On a formal level, the limit “function”

$$
\delta_\xi(x) = \lim_{n \to \infty} g_n(x)
$$

will satisfy the key properties (6.6-7).

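An explicit example of such a sequence is provided by the rational functions

$$
g_n(x) = \frac{n}{\pi\,(1 + n^2 x^2)}.
\tag{6.10}
$$

These functions satisfy

$$
\lim_{n \to \infty} g_n(x) = \begin{cases} 0, & x \ne 0, \\ \infty, & x = 0, \end{cases}
\tag{6.11}
$$

while‡

$$
\int_{-\infty}^{\infty} g_n(x)\,dx = \frac{1}{\pi} \tan^{-1} n x \,\Big|_{x = -\infty}^{\infty} = 1.
\tag{6.12}
$$

Therefore, formally, we identify the limiting function

$$
\lim_{n \to \infty} g_n(x) = \delta(x) = \delta_0(x)
\tag{6.13}
$$
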

with  the  unit-impulse  delta  function  concentrated  at  x  =  0.  As  sketched  in  Figure  6.1,  as  n 
gets  larger  and  larger,  each  successive  function  gn(x)  forms  a  more  and  more  concentrated 
spike,  while  maintaining  a  unit  total  area  under  its  graph.  Thus,  the  limiting  delta  function 
can  be  thought  of  as  an  infinitely  tall  spike  of  zero  width,  entirely  concentrated  at  the  origin. 
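
The concentration of the functions (6.10) is easy to observe numerically. The following sketch, in which the function names and the sample values of $n$ are illustrative choices, tabulates the peak height $g_n(0) = n/\pi$, the value at a fixed point away from the origin, and the fraction of the unit area lying within distance $0.1$ of the origin, the latter computed from the antiderivative appearing in (6.12).

```python
import math

def g(n, x):
    """The rational function g_n(x) = n / (pi (1 + n^2 x^2)) of (6.10)."""
    return n / (math.pi * (1.0 + n * n * x * x))

def mass_near_origin(n, eps):
    """Exact integral of g_n over [-eps, eps], from (1/pi) arctan(n x)."""
    return (2.0 / math.pi) * math.atan(n * eps)

for n in (1, 10, 100, 1000):
    print(n, g(n, 0.0), g(n, 0.5), mass_near_origin(n, 0.1))
```

As $n$ increases, the peak grows without bound, the value at $x = 0.5$ tends to zero, and almost all of the unit area accumulates near the origin.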


† To keep the notation compact, we suppress the dependence of the functions $g_n$ on the point
$\xi$ where the limiting delta function is concentrated.

‡ For the moment, it will be slightly simpler to consider the entire real line $-\infty < x < \infty$.
Exercise 6.1.8 discusses how to adapt the construction to a finite interval.

Remark: There are many other possible choices for the limiting functions $g_n(x)$. See
Exercise  6.1.7  for  another  important  example. 


Remark :  This  construction  of  the  delta  function  highlights  the  perils  of  interchanging 
limits  and  integrals  without  rigorous  justification.  In  any  standard  theory  of  integration 
(Riemann,  Lebesgue,  etc.),  the  limit  of  the  functions  gn  would  be  indistinguishable  from 
the  zero  function,  so  the  limit  of  their  integrals  (6.12)  would  not  equal  the  integral  of  their 
limit: 

$$
1 = \lim_{n \to \infty} \int_{-\infty}^{\infty} g_n(x)\,dx \;\ne\; \int_{-\infty}^{\infty} \lim_{n \to \infty} g_n(x)\,dx = 0.
$$


The  delta  function  is,  in  a  sense,  a  means  of  sidestepping  this  analytic  inconvenience.  The 
full  ramifications  and  theoretical  constructions  underlying  such  limits  must,  however,  be 
deferred  to  a  rigorous  course  in  real  analysis,  [96,98]. 


Once we have defined the basic delta function $\delta(x) = \delta_0(x)$ concentrated at the origin,
we can obtain the delta function concentrated at any other position $\xi$ by a simple
translation:

$$
\delta_\xi(x) = \delta(x - \xi).
\tag{6.14}
$$

Thus, $\delta_\xi(x)$ can be realized as the limit, as $n \to \infty$, of the translated functions

$$
\widehat{g}_n(x) = g_n(x - \xi) = \frac{n}{\pi \bigl( 1 + n^2 (x - \xi)^2 \bigr)}.
\tag{6.15}
$$



Method  #2.  Duality :  The  second  approach  is  a  bit  more  abstract,  but  much  closer 
in  spirit  to  the  proper  rigorous  formulation  of  the  theory  of  distributions  like  the  delta 
function.  The  critical  property  is  that  if  u(x)  is  any  continuous  function,  then 


$$
\int_a^b \delta_\xi(x)\,u(x)\,dx = u(\xi) \qquad \text{for} \qquad a < \xi < b.
\tag{6.16}
$$

Indeed, since $\delta_\xi(x) = 0$ for $x \ne \xi$, the integrand depends only on the value of $u$ at the
point $x = \xi$, and so

$$
\int_a^b \delta_\xi(x)\,u(x)\,dx = \int_a^b \delta_\xi(x)\,u(\xi)\,dx = u(\xi) \int_a^b \delta_\xi(x)\,dx = u(\xi).
$$

Equation (6.16) serves to define a linear functional† $L_\xi : C^0[a,b] \to \mathbb{R}$ that maps a
continuous function $u \in C^0[a,b]$ to its value at the point $x = \xi$:

$$
L_\xi[u] = u(\xi).
\tag{6.17}
$$

The basic linearity requirements (1.11) are immediately established:

$$
L_\xi[u + v] = u(\xi) + v(\xi) = L_\xi[u] + L_\xi[v], \qquad L_\xi[c\,u] = c\,u(\xi) = c\,L_\xi[u],
$$

for any functions $u(x), v(x)$ and any scalar $c$. In the dual approach to generalized functions,
the delta function is, in fact, defined as this particular linear functional (6.17). The function $u(x)$
is sometimes referred to as a test function, since it serves to “test” the form of the linear
functional $L_\xi$.

Remark :  If  the  impulse  point  £  lies  outside  the  integration  domain,  then 


$$
\int_a^b \delta_\xi(x)\,u(x)\,dx = 0 \qquad \text{whenever} \qquad \xi < a \quad \text{or} \quad \xi > b,
\tag{6.18}
$$

because the integrand is identically zero on the entire interval. For technical reasons, we
will not attempt to define the integral (6.18) if the impulse point $\xi = a$ or $\xi = b$ lies on the
boundary of the interval of integration.

The interpretation of the linear functional $L_\xi$ as representing a kind of function $\delta_\xi(x)$
is based on the following line of thought. According to Corollary B.34, every scalar-valued
linear function $L : \mathbb{R}^n \to \mathbb{R}$ on the finite-dimensional vector space $\mathbb{R}^n$ is obtained by taking
the dot product with a fixed element $\mathbf{a} \in \mathbb{R}^n$, so

$$
L[\mathbf{v}] = \mathbf{a} \cdot \mathbf{v}.
$$

In this sense, linear functions on $\mathbb{R}^n$ are the “same” as vectors. Similarly, on the
infinite-dimensional function space $C^0[a,b]$, the $L^2$ inner product

$$
L_g[u] = \langle g, u \rangle = \int_a^b g(x)\,u(x)\,dx,
\tag{6.19}
$$

taken with a fixed continuous function $g \in C^0[a,b]$, defines a real-valued linear functional
$L_g : C^0[a,b] \to \mathbb{R}$. However, unlike the finite-dimensional situation, not every real-valued
linear functional is of this form! In particular, there is no bona fide function $\delta_\xi(x)$ such
that the identity

$$
L_\xi[u] = \langle \delta_\xi, u \rangle = \int_a^b \delta_\xi(x)\,u(x)\,dx = u(\xi)
\tag{6.20}
$$

holds for every continuous function $u(x)$. The bottom line is that every (continuous)
function defines a linear functional, but not every linear functional arises in this manner.


† The term “functional” is used to refer to a linear function whose domain is a function space,
thus avoiding confusion with the functions it acts on.

But the dual interpretation of generalized functions acts as if this were true. Generalized
functions are, in actuality, real-valued linear functionals on function space, but
intuitively interpreted as a kind of function via the $L^2$ inner product. Although this
identification is not to be taken too literally, one can, with some care, manipulate generalized
functions  as  if  they  were  actual  functions,  but  always  keeping  in  mind  that  a  rigorous 
justification  of  such  computations  must  ultimately  rely  on  their  innate  characterization  as 
linear  functionals. 

The two approaches, limits and duality, are completely compatible. Indeed, one
can recover the dual formula (6.20) as the limit

$$
u(\xi) = \lim_{n \to \infty} \langle g_n, u \rangle = \lim_{n \to \infty} \int_a^b g_n(x)\,u(x)\,dx = \int_a^b \delta_\xi(x)\,u(x)\,dx = \langle \delta_\xi, u \rangle
\tag{6.21}
$$

of the inner products of the function $u$ with the approximating concentrated impulse
functions $g_n(x)$ satisfying (6.8-9). In this manner, the limiting linear functional represents the
delta function:

$$
u(\xi) = L_\xi[u] = \lim_{n \to \infty} L_n[u], \qquad \text{where} \qquad L_n[u] = \int_a^b g_n(x)\,u(x)\,dx.
$$


The  choice  of  interpretation  of  the  generalized  delta  function  is,  at  least  on  an  operational 
level,  a  matter  of  taste.  For  the  beginner,  the  limit  version  is  perhaps  easier  to  digest 
initially.  However,  the  dual,  linear  functional  interpretation  has  stronger  connections  with 
the  rigorous  theory  and,  even  in  applications,  offers  some  significant  advantages. 
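
The compatibility of the two viewpoints expressed by (6.21) can be observed numerically. The sketch below approximates the integral of $g_n(x - \xi)\,u(x)$ by the midpoint rule and compares it with $u(\xi)$; the test function $u(x) = \cos x$, the interval $[-1, 2]$, the point $\xi = 0.5$, and the function names are all illustrative assumptions.

```python
import math

def g(n, x):
    """The rational function g_n(x) = n / (pi (1 + n^2 x^2))."""
    return n / (math.pi * (1.0 + n * n * x * x))

def smeared_value(u, xi, n, a, b, steps=200_000):
    """Midpoint-rule approximation to the integral of g_n(x - xi) u(x) on [a, b]."""
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * h
        total += g(n, x - xi) * u(x)
    return total * h

xi = 0.5
errors = {n: abs(smeared_value(math.cos, xi, n, -1.0, 2.0) - math.cos(xi))
          for n in (10, 100, 1000)}
print(errors)
```

The discrepancy shrinks as $n$ grows, reflecting the convergence of the functionals $L_n[u]$ to the point evaluation $L_\xi[u] = u(\xi)$.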

Although the delta function might strike you as somewhat bizarre, its utility throughout
modern applied mathematics and mathematical physics more than justifies including
it  in  your  analytical  toolbox.  While  probably  not  yet  comfortable  with  either  definition, 
you  are  advised  to  press  on  and  familiarize  yourself  with  its  basic  properties.  With  a  little 
care,  you  usually  won’t  go  far  wrong  by  treating  it  as  if  it  were  a  genuine  function.  After 
you  gain  more  practical  experience,  you  can,  if  desired,  return  to  contemplate  just  exactly 
what  kind  of  creature  the  delta  function  really  is. 


Calculus  of  Generalized  Functions 


In  order  to  make  use  of  the  delta  function,  we  need  to  understand  how  it  behaves  under 
the  basic  operations  of  linear  algebra  and  calculus.  First,  we  can  take  linear  combinations 
of  delta  functions.  For  example, 

$$
h(x) = 2\,\delta(x) - 3\,\delta(x - 1) = 2\,\delta_0(x) - 3\,\delta_1(x)
$$

represents a combination of an impulse of magnitude 2 concentrated at $x = 0$ and one
of magnitude $-3$ concentrated at $x = 1$. In the dual interpretation, $h$ defines the linear
functional

$$
L_h[u] = \langle h, u \rangle = \langle 2\,\delta_0 - 3\,\delta_1, u \rangle = 2\,\langle \delta_0, u \rangle - 3\,\langle \delta_1, u \rangle = 2\,u(0) - 3\,u(1),
$$

or, more explicitly, provided $a < 0$ and $b > 1$,

$$
\begin{aligned}
L_h[u] &= \int_a^b h(x)\,u(x)\,dx = \int_a^b \bigl[\, 2\,\delta(x) - 3\,\delta(x - 1) \,\bigr]\, u(x)\,dx \\
&= 2 \int_a^b \delta(x)\,u(x)\,dx - 3 \int_a^b \delta(x - 1)\,u(x)\,dx = 2\,u(0) - 3\,u(1).
\end{aligned}
$$
J  a  J  a 
Figure 6.2. Step function as limit.


Next, since $\delta_\xi(x) = 0$ for any $x \ne \xi$, multiplying the delta function by an ordinary
function is the same as multiplying by a constant:

$$
g(x)\,\delta_\xi(x) = g(\xi)\,\delta_\xi(x),
\tag{6.22}
$$

provided $g(x)$ is continuous at $x = \xi$. For example, $x\,\delta(x) = 0$ is the same as the constant
zero function.


Warning: Since they are inherently linear functionals, it is not permissible to multiply
delta functions together, or to apply more complicated nonlinear operations to them.
Expressions like $\delta(x)^2$, $1/\delta(x)$, $e^{\delta(x)}$, etc., are not well defined in the theory of
generalized functions, which makes their application to nonlinear differential equations
problematic.


The integral of the delta function is the unit step function:

$$
\int_a^x \delta_\xi(t)\,dt = \sigma_\xi(x) = \sigma(x - \xi) = \begin{cases} 0, & x < \xi, \\ 1, & x > \xi, \end{cases}
\qquad \text{provided} \qquad a < \xi.
\tag{6.23}
$$

Unlike the delta function, the step function $\sigma_\xi(x)$ is an ordinary function. It is continuous,
indeed constant, except at $x = \xi$. The value of the step function at the discontinuity
$x = \xi$ is left unspecified, although a wise choice, compatible with Fourier theory, is to
set $\sigma_\xi(\xi) = \tfrac12$, the average of its left- and right-hand limits.

We  note  that  the  integration  formula  (6.23)  is  compatible  with  our  characterization  of 
the  delta  function  as  the  limit  of  highly  concentrated  forces.  Integrating  the  approximating 
functions  (6.10),  we  obtain 


\[ f_n(x) = \int_{-\infty}^{x} g_n(t)\,dt = \frac{1}{\pi} \tan^{-1} n x + \frac{1}{2}. \]

Since

\[ \lim_{y \to \infty} \tan^{-1} y = \tfrac12 \pi, \qquad \text{while} \qquad \lim_{y \to -\infty} \tan^{-1} y = -\tfrac12 \pi, \]

these functions converge (nonuniformly) to the step function:

\[ \lim_{n \to \infty} f_n(x) = \sigma(x) = \begin{cases} 0, & x < 0, \\ \tfrac12, & x = 0, \\ 1, & x > 0. \end{cases} \tag{6.24} \]

Figure 6.3. First and second-order ramp functions.


A  graphical  illustration  of  this  limiting  process  appears  in  Figure  6.2. 
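The nonuniform character of the convergence in (6.24) can also be seen numerically. The sketch below evaluates the integrated approximations $(1/\pi) \tan^{-1} n x + \tfrac12$ at a fixed point and at the moving point $x = 1/n$; the function names and sample points are illustrative choices.

```python
import math

def f(n, x):
    """Integral of g_n from -infinity to x: (1/pi) arctan(n x) + 1/2."""
    return math.atan(n * x) / math.pi + 0.5

def sigma(x):
    """Unit step function, with the value 1/2 assigned at the jump."""
    if x < 0:
        return 0.0
    if x > 0:
        return 1.0
    return 0.5

for n in (1, 10, 100, 1000):
    fixed_error = abs(f(n, 0.05) - sigma(0.05))        # shrinks as n grows
    moving_error = abs(f(n, 1.0 / n) - sigma(1.0 / n))  # stays at 1/4
    print(n, fixed_error, moving_error)
```

At any fixed $x \ne 0$ the error tends to zero, but at $x = 1/n$ the value is always $\tfrac34$, so the maximum error over all $x$ does not shrink.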

The integral of the discontinuous step function (6.23) is the continuous ramp function

\[ \int_a^x \sigma_\xi(t)\,dt = \rho_\xi(x) = \rho(x - \xi) = \begin{cases} 0, & x < \xi, \\ x - \xi, & x > \xi, \end{cases} \qquad \text{provided } a < \xi, \tag{6.25} \]

which is graphed in Figure 6.3. Note that $\rho_\xi(x)$ has a corner at $x = \xi$, and so is not differentiable there; indeed, its derivative $\rho'(x - \xi) = \sigma(x - \xi)$ has a jump discontinuity. We can continue to integrate; the $(n + 1)$st integral of the delta function is the $n$th order ramp function

\[ \rho_{n,\xi}(x) = \rho_n(x - \xi) = \begin{cases} 0, & x < \xi, \\ \dfrac{(x - \xi)^n}{n!}, & x > \xi. \end{cases} \tag{6.26} \]

Note that $\rho_{n,\xi} \in C^{n-1}$ has only $n - 1$ continuous derivatives.


What  about  differentiation?  Motivated  by  the  Fundamental  Theorem  of  Calculus, 
we  shall  use  formula  (6.23)  to  identify  the  derivative  of  the  step  function  with  the  delta 
function 


\[ \frac{d\sigma}{dx} = \delta. \tag{6.27} \]


This  fact  is  highly  significant.  In  elementary  calculus,  one  is  not  allowed  to  differentiate 
a  discontinuous  function.  Here,  we  discover  that  the  derivative  can  be  defined,  not  as  an 
ordinary  function,  but  rather  as  a  generalized  delta  function! 

In general, the derivative of a piecewise $C^1$ function with jump discontinuities is a generalized function that includes a delta function concentrated at each discontinuity, whose magnitude equals the jump magnitude. More explicitly, suppose that $f(x)$ is differentiable, in the usual calculus sense, everywhere except at a point $\xi$, where it has a jump discontinuity of magnitude $\beta$. Using the step function (3.47), we can re-express

\[ f(x) = g(x) + \beta\,\sigma(x - \xi), \tag{6.28} \]

where $g(x)$ is continuous everywhere (apart from a removable discontinuity at $x = \xi$), and differentiable except possibly at the jump. Differentiating (6.28), we find that

\[ f'(x) = g'(x) + \beta\,\delta(x - \xi) \tag{6.29} \]

has a delta spike of magnitude $\beta$ at the discontinuity. Thus, the derivatives of $f$ and $g$ coincide everywhere except at the discontinuity.

Figure 6.4. The derivative of the discontinuous function in Example 6.1.


Example 6.1. Consider the function

\[ f(x) = \begin{cases} -x, & x < 1, \\ \tfrac15 x^2, & x > 1, \end{cases} \tag{6.30} \]

which we graph in Figure 6.4. We note that $f$ has a single jump discontinuity at $x = 1$ of magnitude

\[ f(1^+) - f(1^-) = \tfrac15 - (-1) = \tfrac65. \]

This means that

\[ f(x) = g(x) + \tfrac65\,\sigma(x - 1), \qquad \text{where} \qquad g(x) = \begin{cases} -x, & x < 1, \\ \tfrac15 x^2 - \tfrac65, & x > 1, \end{cases} \]

is continuous everywhere, since its right- and left-hand limits at the original discontinuity are equal: $g(1^+) = g(1^-) = -1$. Therefore,

\[ f'(x) = g'(x) + \tfrac65\,\delta(x - 1), \qquad \text{where} \qquad g'(x) = \begin{cases} -1, & x < 1, \\ \tfrac25 x, & x > 1, \end{cases} \]

while $g'(1)$ and $f'(1)$ are not defined. In Figure 6.4, the delta spike in the derivative of $f$ is symbolized by a vertical line, although this pictorial device fails to indicate its magnitude of $\tfrac65$.

Note that in this particular example, $g'(x)$ can be found by directly differentiating the formula for $f(x)$. Indeed, in general, once we determine the magnitude and location of the jump discontinuities of $f(x)$, we can compute its derivative without introducing the auxiliary function $g(x)$.
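One way to test the formula $f'(x) = g'(x) + \tfrac65\,\delta(x - 1)$ of Example 6.1 is through duality: for a test function $u$ that vanishes at the ends of the integration interval, $\langle f', u \rangle$ should equal $-\langle f, u' \rangle$. The sketch below compares the two sides numerically. The test function $u$, the interval $[-1, 3]$, and the midpoint rule are hypothetical choices made only for this illustration.

```python
def f(x):
    """The function (6.30): -x for x < 1 and x^2 / 5 for x > 1."""
    return -x if x < 1 else x * x / 5.0

def u(x):
    """Hypothetical test function; u and u' both vanish at x = -1 and x = 3."""
    s = (x - 1.0) / 2.0
    return (1.0 - s * s) ** 2

def du(x):
    """Derivative of u, computed by the chain rule."""
    s = (x - 1.0) / 2.0
    return -2.0 * s * (1.0 - s * s)

def midpoint(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    dx = (b - a) / steps
    return sum(func(a + (i + 0.5) * dx) for i in range(steps)) * dx

# Left side: -<f, u'>, integrated separately on each side of the jump at x = 1.
lhs = -(midpoint(lambda x: f(x) * du(x), -1.0, 1.0)
        + midpoint(lambda x: f(x) * du(x), 1.0, 3.0))

# Right side: <g', u> + (6/5) u(1), with g'(x) = -1 for x < 1 and 2x/5 for x > 1.
rhs = (midpoint(lambda x: -u(x), -1.0, 1.0)
       + midpoint(lambda x: 0.4 * x * u(x), 1.0, 3.0)
       + 1.2 * u(1.0))

print(lhs, rhs)
```

The two printed numbers should agree to within quadrature error; dropping the term $\tfrac65\,u(1)$ from the right-hand side destroys the agreement, which is the numerical signature of the delta spike.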


Example 6.2. As a second, more streamlined, example, consider the function

\[ f(x) = \begin{cases} -x, & x < 0, \\ x^2 - 1, & 0 < x < 1, \\ 2\,e^{-x}, & x > 1, \end{cases} \]

which is plotted in Figure 6.5. This function has jump discontinuities of magnitude $-1$ at $x = 0$, and of magnitude $2/e$ at $x = 1$. Therefore, in light of the preceding remark,

\[ f'(x) = -\,\delta(x) + \frac{2}{e}\,\delta(x - 1) + \begin{cases} -1, & x < 0, \\ 2x, & 0 < x < 1, \\ -2\,e^{-x}, & x > 1, \end{cases} \]

where the final terms are obtained by directly differentiating $f(x)$.

Figure 6.5. The derivative of the discontinuous function in Example 6.2.


Example 6.3. The derivative of the absolute value function

\[ a(x) = |x| = \begin{cases} x, & x > 0, \\ -x, & x < 0, \end{cases} \]

is the sign function

\[ a'(x) = \operatorname{sign} x = \begin{cases} +1, & x > 0, \\ -1, & x < 0. \end{cases} \tag{6.31} \]

Note that there is no delta function in $a'(x)$ because $a(x)$ is continuous everywhere. Since $\operatorname{sign} x$ has a jump of magnitude 2 at the origin and is otherwise constant, its derivative is twice the delta function:

\[ \frac{d}{dx} \operatorname{sign} x = 2\,\delta(x). \]


Example 6.4. We are even allowed to differentiate the delta function. Its first derivative $\delta'(x)$ can be interpreted in two ways. First, as the limit of the derivatives of the approximating functions (6.10):

\[ \frac{d\delta}{dx} = \lim_{n \to \infty} \frac{dg_n}{dx} = \lim_{n \to \infty} \frac{-2\,n^3 x}{\pi\,(1 + n^2 x^2)^2}. \tag{6.32} \]

The graphs of these rational functions take the form of more and more concentrated spiked “doublets”, as illustrated in Figure 6.6. To determine the effect of the derivative on a test function $u(x)$, we compute the limiting integral

\[ \langle \delta', u \rangle = \int_{-\infty}^{\infty} \delta'(x)\,u(x)\,dx = \lim_{n \to \infty} \int_{-\infty}^{\infty} g_n'(x)\,u(x)\,dx = -\lim_{n \to \infty} \int_{-\infty}^{\infty} g_n(x)\,u'(x)\,dx = -\int_{-\infty}^{\infty} \delta(x)\,u'(x)\,dx = -\,u'(0). \tag{6.33} \]

The middle step is the result of an integration by parts, noting that the boundary terms at $\pm\infty$ vanish, provided that $u(x)$ is continuously differentiable and bounded as $|x| \to \infty$. Pay attention to the minus sign in the final answer.

In the dual interpretation, the generalized function $\delta'(x)$ corresponds to the linear functional

\[ L'[u] = -\,u'(0) = \langle \delta', u \rangle = \int_a^b \delta'(x)\,u(x)\,dx, \qquad \text{where } a < 0 < b, \tag{6.34} \]

which maps a continuously differentiable function $u(x)$ to minus its derivative at the origin. We note that (6.34) is compatible with a formal integration by parts:

\[ \int_a^b \delta'(x)\,u(x)\,dx = \delta(x)\,u(x)\,\Big|_{x=a}^{b} - \int_a^b \delta(x)\,u'(x)\,dx = -\,u'(0). \]

The boundary terms at $x = a$ and $x = b$ automatically vanish, since $\delta(x) = 0$ for $x \ne 0$.
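The minus sign in $\langle \delta', u \rangle = -\,u'(0)$ is easy to get wrong, so a quick numerical check may be helpful. The sketch below integrates the doublets $g_n'(x) = -2 n^3 x / (\pi (1 + n^2 x^2)^2)$ against a test function. The particular $u$ (chosen so that $u'(0) = 1$), the truncation of the real line to $[-6, 6]$, and the midpoint rule are assumptions made for illustration only.

```python
import math

def dg(n, x):
    """Derivative of g_n(x) = n / (pi (1 + n^2 x^2)), as in (6.32)."""
    return -2.0 * n ** 3 * x / (math.pi * (1.0 + (n * x) ** 2) ** 2)

def u(x):
    """Hypothetical rapidly decaying test function with u'(0) = 1."""
    return math.sin(x) * math.exp(-x * x)

def pair(n, half_width=6.0, steps=24000):
    """Midpoint-rule approximation of the integral of g_n'(x) u(x) over [-6, 6]."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        total += dg(n, x) * u(x)
    return total * dx

for n in (10, 50, 200):
    print(n, pair(n))   # values should drift toward -u'(0) = -1
```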


Remark :  While  we  can  test  the  delta  function  with  any  continuous  function,  we  are 
permitted  to  test  its  derivative  only  on  continuously  differentiable  functions.  To  avoid 
keeping  track  of  such  technicalities,  one  often  restricts  to  only  infinitely  differentiable  test 
functions. 


Warning: The functions $\tilde g_n(x) = g_n(x) + g_n'(x)$, cf. (6.10), (6.32), satisfy $\lim_{n \to \infty} \tilde g_n(x) = 0$ for all $x \ne 0$, while $\int_{-\infty}^{\infty} \tilde g_n(x)\,dx = 1$. However, $\lim_{n \to \infty} \tilde g_n = \lim_{n \to \infty} g_n + \lim_{n \to \infty} g_n' = \delta + \delta'$. Thus, our original conditions (6.8-9) are not in fact sufficient to characterize whether a sequence of functions has the delta function as a limit. To be absolutely sure, one must, in fact, verify the more comprehensive limiting formula (6.21).

Exercises


/7T  CZ 

5(x)cosxdx,  (b)  /  4(x)  (x  —  2)  dx, 

—  7 T  J  1 

(c)  f  5l(x)exdx ,  (d)  f  d(x  —  2)  logx  dx,  (e)  f  d(x— ^r)x2dx,  (f)  f  2 

w  0  w  1  «/  0  v  —  1  1  “I-  X 


1  d(x  +  2)  dx 
-l 


6.1.2. Simplify the following generalized functions; then write out how they act on a suitable test function $u(x)$: (a) $e^x\,\delta(x)$, (b) $x\,\delta(x - 1)$, (c) $3\,\delta_1(x) - 3x\,\delta_{-1}(x)$,


(d)  ,  (e)  (cosx)  5(x)  +  S(x  —  tv)  +  Six  +  tt)  ,  (f) 

x  +  1  L  J 

6.1.3. Define the generalized function $\varphi(x) = \delta(x + 1) - \delta(x - 1)$:

(a)  as  a  limit  of  ordinary  functions;  (b)  using  duality. 


<*iQk)  ~  $2(x) 

x2  +  1 


6.1.4.  Find  and  sketch  a  graph  of  the  derivative  (in  the  context  of  generalized  functions)  of  the 
following  functions: 


(a)  /(x) 


(c)  h(x) 


x  ,  0  <  x  <  3, 

x ,  —  1  <  x  <  0, 

0,  otherwise, 
sin  7 ray  x  >  1, 

1  —  x2,  —  1  <  x  <  1. 

e  ,  x  <  —1, 


(0  5(x) 


sm 

0, 


<  5*3 


otherwise, 

X  <  —  7T, 


(d)  /c(x) 


smx, 

22  /  /  n 
X  —  7T  ,  —  7T  <  X  <  0, 

X  v  A 

e  ,  x  >  0. 


1x  +  1,  — 1  <  x  <  0, 

1  —  x,  0  <  x  <  1, 

0,  otherwise, 

— 2  <  x  <  2,  ,  ,  /  +  cos  — 1  <  x  <  1, 

otherwise,  C  S  X  1  0,  otherwise. 


(b)  /c(x) 


0. 


6.1.6.  Find  the  first  and  second  derivatives  of  /(x)  =  (a)  e  1  x  1  ,  (b)  2  |  x  \  —  \  x  —  1  |  , 

Q  O 

(c)  |  x  +  x  |  ,  (d)  xsign(x  —  4),  (e)  sin|x|,  (f)  |sinx|,  (g)  sign(sinx). 

0 6.1.7. Explain why the Gaussian functions $g_n(x) = \dfrac{n}{\sqrt{\pi}}\,e^{-n^2 x^2}$ have the delta function $\delta(x)$ as their limit as $n \to \infty$.

^  6.1.8.  In  this  exercise,  we  realize  the  delta  function  d^(x)  as  a  limit  of  functions  on  a  finite 
interval  [a,  b].  Let  a  <z<b. 

g  —  £) 

(a)  Prove  that  the  functions  gn(x)  =  71  - ,  where  gn(x)  is  given  by  (6.10)  and 

fb  Mfl  _ 

Mn  —  \  gn(x  —  £)  dx ,  satisfy  (6.8-9),  and  hence  lim  gn{x)  =  <G(x). 

J  a  n->oo  G 

rb 

(b)  One  can,  alternatively,  relax  the  second  condition  (6.9)  to  lim  /  gn(x  —  £)  dx  =  1. 

n  — »  oo  J  a 

Show  that,  under  this  relaxed  definition,  lim  gn(x  —  £)  =  d^(x). 


n  — >■  oo 


<  Vx 


(a)  Sketch  a  graph  of 


f  in 

O  6.1.9.  For  each  positive  integer  n,  let  g  (x)  =  <  2  5 

(  0,  otherwise. 

/X 

gn(y)  dy  and  sketch 

-  oo 

a  graph.  Does  the  sequence  /n(x)  converge  to  the  step  function  cr(x)  as  n  oo?  (d)  Find 
the  derivative  h  (x)  =  g'n{x).  (e)  Does  the  sequence  h  (x)  converge  to  Sf(x)  as  n  — >►  oo? 


O 6.1.10. Answer Exercise 6.1.9 for the hat functions $g_n(x) = \begin{cases} n - n^2\,|x|, & |x| < 1/n, \\ 0, & \text{otherwise.} \end{cases}$

6.1.11. Justify the formula $x\,\delta(x) = 0$ using (a) limits, (b) duality.

0 6.1.12. (a) Justify the formula $\delta(2x) = \tfrac12\,\delta(x)$ by (i) limits, (ii) duality. (b) Find a similar formula for $\delta(ax)$ when $a > 0$. (c) What about when $a < 0$?

6.1.13. (a) Prove that $\sigma(\lambda x) = \sigma(x)$ for any $\lambda > 0$. (b) What about if $\lambda < 0$? (c) Use parts (a), (b) to deduce that $\delta(\lambda x) = \dfrac{1}{|\lambda|}\,\delta(x)$ for any $\lambda \ne 0$.

6.1.14. Let $g(x)$ be a continuously differentiable function with $g'(x) \ne 0$ for all $x \in \mathbb{R}$. Does the composition $\delta(g(x))$ make sense as a distribution? If so, can you identify it?

6.1.15. Let $\xi < a$. Sketch the graphs of (a) $s(x) = \int_a^x \delta_\xi(z)\,dz$, (b) $r(x) = \int_a^x \sigma_\xi(z)\,dz$.

6.1.16. Justify the formula $\lim_{n \to \infty} n\,\bigl[\,\delta(x - \tfrac1n) - \delta(x + \tfrac1n)\,\bigr] = -2\,\delta'(x)$.

6.1.17. Define the generalized function $\delta''(x)$: (a) as a limit of ordinary functions; (b) using duality.

6.1.18. Let $\delta_\xi^{(k)}(x)$ denote the $k$th derivative of the delta function $\delta_\xi(x)$. Justify the formula $\langle \delta_\xi^{(k)}, u \rangle = (-1)^k\,u^{(k)}(\xi)$ whenever $u \in C^k$ is $k$-times continuously differentiable.

6.1.19. According to (6.22), $x\,\delta(x) = 0$. On the other hand, by Leibniz’ rule, $(x\,\delta(x))' = \delta(x) + x\,\delta'(x)$ is apparently not zero. Can you explain this paradox?

6.1.20. If $f \in C^1$, should $(f\,\delta)' = f\,\delta'$ or $f'\,\delta + f\,\delta'$?

A 6.1.21. (a) Use duality to justify the formula $f(x)\,\delta'(x) = f(0)\,\delta'(x) - f'(0)\,\delta(x)$ when $f \in C^1$. (b) Find a similar formula for $f(x)\,\delta^{(n)}(x)$ as the product of a sufficiently smooth function and the $n$th derivative of the delta function.

6.1.22. Use Exercise 6.1.21 to simplify the following generalized functions; then write out how they act on a suitable test function $u(x)$:

(a) $\varphi(x) = (x - 2)\,\delta'(x)$, (b) $\psi(x) = (1 + \sin x)\,\bigl[\,\delta(x) + \delta'(x)\,\bigr]$,


(c)  x(x)  =  °°2  $(x  —  1)  —  S'(x  —  2) 


(d)  c o(x)  =  ex  8  (x  +  1). 

A 6.1.23. Prove that if $f(x)$ is a continuous function, and $\int_a^b f(x)\,dx = 0$ for every interval $[a, b]$, then $f(x) \equiv 0$ everywhere.

6.1.24. Write out a rigorous proof that there is no continuous function $\delta_\xi(x)$ such that the inner product identity (6.20) holds for every continuous function $u(x)$.

6.1.25. True or false: The sequence (6.24) converges uniformly.

6.1.26. True or false: $\| \delta \| = 1$.


The  Fourier  Series  of  the  Delta  Function 


Let  us  next  investigate  the  capability  of  Fourier  series  to  represent  generalized  functions. 
We begin with the delta function $\delta(x)$, based at the origin. Using the characterizing properties (6.16), its real Fourier coefficients are

\[ a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} \delta(x) \cos kx\,dx = \frac{1}{\pi} \cos k0 = \frac{1}{\pi}, \qquad b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} \delta(x) \sin kx\,dx = \frac{1}{\pi} \sin k0 = 0. \tag{6.35} \]

Therefore, at least on a formal level, its Fourier series is

\[ \delta(x) \sim \frac{1}{2\pi} + \frac{1}{\pi} \bigl( \cos x + \cos 2x + \cos 3x + \cdots \bigr). \tag{6.36} \]


Since $\delta(x) = \delta(-x)$ is an even function (why?), it should come as no surprise that it has a cosine series. Alternatively, we can rewrite the series in complex form

\[ \delta(x) \sim \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} e^{ikx} = \frac{1}{2\pi} \bigl( \cdots + e^{-2ix} + e^{-ix} + 1 + e^{ix} + e^{2ix} + \cdots \bigr), \tag{6.37} \]

where the complex Fourier coefficients are computed† as

\[ c_k = \frac{1}{2\pi} \int_{-\pi}^{\pi} \delta(x)\,e^{-ikx}\,dx = \frac{1}{2\pi}. \]

Remark: Although we stated that the Fourier series (6.36) represents the delta function, this is not entirely correct. Remember that a Fourier series converges to the $2\pi$-periodic extension of the original function. Therefore, (6.37) actually represents the periodic extension of the delta function, sometimes called the Dirac comb,

\[ \tilde\delta(x) = \cdots + \delta(x + 4\pi) + \delta(x + 2\pi) + \delta(x) + \delta(x - 2\pi) + \delta(x - 4\pi) + \delta(x - 6\pi) + \cdots, \tag{6.38} \]

consisting of a periodic array of unit impulses concentrated at all integer multiples of $2\pi$.

Let  us  investigate  in  what  sense  (if  any)  the  Fourier  series  (6.36)  or,  equivalently, 
(6.37),  represents  the  delta  function.  The  first  observation  is  that,  because  its  summands 
do  not  tend  to  zero,  the  series  certainly  doesn’t  converge  in  the  usual,  calculus,  sense. 
Nevertheless,  in  a  “weak”  sense,  the  series  can  be  regarded  as  converging  to  the  (periodic 
extension  of  the)  delta  function. 

To  understand  the  convergence  mechanism,  we  recall  that  we  already  established  a 
formula  (3.129)  for  the  partial  sums: 


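\[ s_n(x) = \frac{1}{2\pi} + \frac{1}{\pi} \sum_{k=1}^{n} \cos kx = \frac{1}{2\pi}\,\frac{\sin\bigl(n + \tfrac12\bigr)x}{\sin \tfrac12 x}. \tag{6.39} \]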


Graphs of some of the partial sums on the interval $[-\pi, \pi]$ are displayed in Figure 6.7. Note that, as $n$ increases, the spike at $x = 0$ becomes progressively taller and thinner, converging to an infinitely tall delta spike. (We had to truncate the last two graphs; the spike extends beyond the top.) Indeed, by l’Hôpital’s Rule,

\[ \lim_{x \to 0} \frac{1}{2\pi}\,\frac{\sin\bigl(n + \tfrac12\bigr)x}{\sin \tfrac12 x} = \lim_{x \to 0} \frac{1}{2\pi}\,\frac{\bigl(n + \tfrac12\bigr) \cos\bigl(n + \tfrac12\bigr)x}{\tfrac12 \cos \tfrac12 x} = \frac{n + \tfrac12}{\pi} \longrightarrow \infty \qquad \text{as } n \to \infty. \]


(An elementary proof of this formula is to note that, at $x = 0$, every cosine term in the original sum (6.36) is equal to 1.) Furthermore, the integrals remain fixed,

\[ \int_{-\pi}^{\pi} s_n(x)\,dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{\sin\bigl(n + \tfrac12\bigr)x}{\sin \tfrac12 x}\,dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} \sum_{k=-n}^{n} e^{ikx}\,dx = 1, \tag{6.40} \]

as required for convergence to the delta function. However, away from the spike, the partial sums do not go to zero! Rather, they oscillate ever more rapidly, while maintaining a fixed overall amplitude of

\[ \frac{1}{2\pi} \csc \tfrac12 x = \frac{1}{2\pi \sin \tfrac12 x}. \tag{6.41} \]

† Or we could use (3.66).

Figure 6.7. Partial Fourier sums approximating the delta function.

As $n$ increases, the amplitude function (6.41) can be seen, as in Figure 6.7, as the envelope of the increasingly rapid oscillations. So, roughly speaking, the convergence $s_n(x) \to \delta(x)$ means that the “infinitely fast” oscillations are somehow canceling each other out, and the net effect is zero away from the spike at $x = 0$. So the convergence of the Fourier sums to $\delta(x)$ is much more subtle than in the original limiting definition (6.10).
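The three properties of the partial sums discussed above (agreement of the cosine sum with the closed form (6.39), the peak value $(n + \tfrac12)/\pi$ at $x = 0$, and the unit integral (6.40)) can be confirmed with a few lines of code. This is only a sketch; the function names and the midpoint quadrature are illustrative choices.

```python
import math

def s_sum(n, x):
    """Partial sum 1/(2 pi) + (1/pi) * sum_{k=1}^{n} cos(k x)."""
    return 1.0 / (2.0 * math.pi) + sum(math.cos(k * x) for k in range(1, n + 1)) / math.pi

def s_closed(n, x):
    """Closed form (6.39); only valid where sin(x/2) is nonzero."""
    return math.sin((n + 0.5) * x) / (2.0 * math.pi * math.sin(0.5 * x))

def integral(n, steps=4000):
    """Midpoint-rule approximation of the integral of s_n over [-pi, pi]."""
    dx = 2.0 * math.pi / steps
    return sum(s_sum(n, -math.pi + (i + 0.5) * dx) for i in range(steps)) * dx

n = 20
print(s_sum(n, 0.7), s_closed(n, 0.7))      # the two formulas agree
print(s_sum(n, 0.0), (n + 0.5) / math.pi)   # height of the spike at x = 0
print(integral(n))                          # total integral stays at 1
```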

The  technical  term  is  weak  convergence ,  which  plays  a  very  important  role  in  advanced 
mathematical  analysis,  signal  processing,  composite  materials,  and  elsewhere. 


Definition 6.5. A sequence of functions $f_n(x)$ is said to converge weakly to $f_\star(x)$ on an interval $[a, b]$ if their $L^2$ inner products with every continuous test function $u(x) \in C^0[a, b]$ converge:

\[ \int_a^b f_n(x)\,u(x)\,dx \longrightarrow \int_a^b f_\star(x)\,u(x)\,dx \qquad \text{as } n \to \infty. \tag{6.42} \]

Weak convergence is often indicated by a half-pointed arrow: $f_n \rightharpoonup f_\star$.

Remark: On unbounded intervals, one usually restricts the test functions to have compact support, meaning that $u(x) = 0$ for all sufficiently large $|x| \gg 0$. One can also restrict to smooth test functions only, e.g., require that $u \in C^\infty[a, b]$.




Example 6.6. Let us show that the trigonometric functions $f_n(x) = \cos nx$ converge weakly to the zero function:

\[ \cos nx \rightharpoonup 0 \qquad \text{as } n \to \infty \text{ on the interval } [-\pi, \pi]. \]

(Actually, this holds on any interval; see Exercise 6.1.38.) According to the definition, we need to prove that

\[ \lim_{n \to \infty} \int_{-\pi}^{\pi} u(x) \cos nx\,dx = 0 \]

for any continuous function $u \in C^0[-\pi, \pi]$. But this is just a restatement of the Riemann-Lebesgue Lemma 3.40, which says that the high-frequency Fourier coefficients of a continuous (indeed, even square-integrable) function $u(x)$ go to zero. The same remark establishes the weak convergence $\sin nx \rightharpoonup 0$.

Observe that the functions $\cos nx$ fail to converge pointwise to 0 at any value of $x$. Indeed, if $x$ is an integer multiple of $2\pi$, then $\cos nx = 1$ for all $n$. If $x$ is any other rational multiple of $\pi$, the values of $\cos nx$ periodically cycle through a finite number of different values, and never go to 0, while if $x$ is an irrational multiple of $\pi$, they oscillate aperiodically between $-1$ and $+1$. The functions also fail to converge in norm to 0, since their (unscaled) $L^2$ norms remain fixed at

\[ \| \cos nx \| = \sqrt{\int_{-\pi}^{\pi} \cos^2 nx\,dx} = \sqrt{\pi} \qquad \text{for all } n > 0. \]

The  cancellation  of  oscillations  in  the  high-frequency  limit  is  a  characteristic  feature  of 
weak  convergence. 
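The contrast between weak convergence and convergence in norm in Example 6.6 can be observed directly. In the sketch below, the inner products of $\cos nx$ with a fixed continuous function shrink as $n$ grows, while the $L^2$ norms stay at $\sqrt{\pi}$. The test function $u(x) = e^x + |x|$ and the quadrature settings are arbitrary illustrative choices.

```python
import math

def midpoint(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    dx = (b - a) / steps
    return sum(func(a + (i + 0.5) * dx) for i in range(steps)) * dx

def u(x):
    """Hypothetical continuous (but not differentiable) test function."""
    return math.exp(x) + abs(x)

def inner(n):
    """Inner product of u with cos(n x) on [-pi, pi]."""
    return midpoint(lambda x: u(x) * math.cos(n * x), -math.pi, math.pi)

def norm(n):
    """Unscaled L^2 norm of cos(n x) on [-pi, pi]."""
    return math.sqrt(midpoint(lambda x: math.cos(n * x) ** 2, -math.pi, math.pi))

for n in (1, 4, 16, 64):
    print(n, inner(n), norm(n))
```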

Let us now explain why, although the Fourier series (6.36) does not converge to the delta function either pointwise or in norm (indeed, $\|\delta\|$ is not even defined!), it does converge weakly on $[-\pi, \pi]$. More specifically, we need to prove that the partial sums $s_n \rightharpoonup \delta$, meaning that

\[ \lim_{n \to \infty} \int_{-\pi}^{\pi} s_n(x)\,u(x)\,dx = \int_{-\pi}^{\pi} \delta(x)\,u(x)\,dx = u(0) \tag{6.43} \]

for every sufficiently nice function $u$, or, equivalently,

\[ \lim_{n \to \infty} \frac{1}{2\pi} \int_{-\pi}^{\pi} u(x)\,\frac{\sin\bigl(n + \tfrac12\bigr)x}{\sin \tfrac12 x}\,dx = u(0). \tag{6.44} \]

But this is a restatement of a special case of the identities (3.130) used in the proof of the Pointwise Convergence Theorem 3.8 for the Fourier series of a (piecewise) $C^1$ function. Indeed, summing the two identities in (3.130) and then setting $x = 0$ reproduces (6.44), since, by continuity, $u(0) = \tfrac12 \bigl[\,u(0^+) + u(0^-)\,\bigr]$. In other words, the pointwise convergence of the Fourier series of a $C^1$ function is equivalent to the weak convergence† of the Fourier series of the delta function!
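The weak convergence statement (6.43) can be tried out on a concrete test function. The sketch below integrates the partial sums $s_n(x) = \frac{1}{2\pi} + \frac{1}{\pi} \sum_{k=1}^{n} \cos kx$ against $u(x) = e^x$, for which $u(0) = 1$. The choice of $u$ and the midpoint rule are assumptions for illustration; any $C^1$ function could be substituted.

```python
import math

def s(n, x):
    """Partial sum s_n(x) = 1/(2 pi) + (1/pi) * sum_{k=1}^{n} cos(k x)."""
    return 1.0 / (2.0 * math.pi) + sum(math.cos(k * x) for k in range(1, n + 1)) / math.pi

def u(x):
    """Hypothetical C^1 test function with u(0) = 1."""
    return math.exp(x)

def pair(n, steps=4000):
    """Midpoint-rule approximation of the integral of s_n(x) u(x) over [-pi, pi]."""
    dx = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        x = -math.pi + (i + 0.5) * dx
        total += s(n, x) * u(x)
    return total * dx

for n in (5, 10, 20, 40):
    print(n, pair(n))   # values should approach u(0) = 1
```

The integrals are just the partial Fourier sums of $u$ evaluated at $x = 0$, which is why their convergence to $u(0)$ mirrors the pointwise convergence theorem.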


† Definition 6.5 only requires continuity of the test functions, whereas in (6.44) they need to be $C^1$, so the notion of weak convergence here is slightly more refined. One often restricts further to allow only $C^\infty$ test functions.

Example 6.7. If we differentiate the Fourier series

\[ x \sim 2 \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} \sin kx = 2 \Bigl( \sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \frac{\sin 4x}{4} + \cdots \Bigr), \]

we obtain an apparent contradiction:

\[ 1 \sim 2 \sum_{k=1}^{\infty} (-1)^{k+1} \cos kx = 2 \cos x - 2 \cos 2x + 2 \cos 3x - 2 \cos 4x + \cdots. \tag{6.45} \]

But the Fourier series for 1 consists of just a single constant term! (Why?)

The resolution of this paradox is not difficult. The Fourier series (3.37) does not converge to $x$, but rather to its $2\pi$-periodic extension $\tilde f(x)$, which has downward jump discontinuities of magnitude $2\pi$ at odd multiples of $\pi$; see Figure 3.1. Thus, Theorem 3.22 is not directly applicable. Nevertheless, we can assign a consistent interpretation to the differentiated series. The derivative $\tilde f'(x)$ of the periodic extension is not equal to the constant function 1, but rather has an additional delta function concentrated at each jump discontinuity:

\[ \tilde f'(x) = 1 - 2\pi \sum_{j=-\infty}^{\infty} \delta\bigl(x - (2j + 1)\pi\bigr) = 1 - 2\pi\,\tilde\delta(x - \pi), \]

where $\tilde\delta$ denotes the $2\pi$-periodic extension of the delta function, cf. (6.38). The differentiated Fourier series (6.45) does, in fact, represent $\tilde f'(x)$. Indeed, the Fourier coefficients of $\tilde\delta(x - \pi)$ are

\[ a_k = \frac{1}{\pi} \int_0^{2\pi} \tilde\delta(x - \pi) \cos kx\,dx = \frac{1}{\pi} \cos k\pi = \frac{(-1)^k}{\pi}, \qquad b_k = \frac{1}{\pi} \int_0^{2\pi} \tilde\delta(x - \pi) \sin kx\,dx = \frac{1}{\pi} \sin k\pi = 0. \]

Observe that we changed the interval of integration to $[0, 2\pi]$ to avoid placing the delta function singularities at the endpoints. Thus,

\[ \tilde\delta(x - \pi) \sim \frac{1}{2\pi} + \frac{1}{\pi} \bigl( -\cos x + \cos 2x - \cos 3x + \cdots \bigr), \tag{6.46} \]

which serves to resolve the contradiction.


Example 6.8. Let us differentiate the Fourier series

\[ \sigma(x) \sim \frac{1}{2} + \frac{2}{\pi} \Bigl( \sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \frac{\sin 7x}{7} + \cdots \Bigr) \]

for the unit step function we found in Example 3.9 and see whether we end up with the Fourier series (6.36) for the delta function. We compute

\[ \frac{d\sigma}{dx} \sim \frac{2}{\pi} \bigl( \cos x + \cos 3x + \cos 5x + \cos 7x + \cdots \bigr), \tag{6.47} \]

which does not agree with (6.36): half the terms are missing! The explanation is similar to the preceding example: the $2\pi$-periodic extension $\tilde\sigma(x)$ of the step function has two jump discontinuities, of magnitudes $+1$ at even multiples of $\pi$ and $-1$ at odd multiples; see Figure 3.6. Therefore, its derivative

\[ \frac{d\tilde\sigma}{dx} = \tilde\delta(x) - \tilde\delta(x - \pi) \]

is the difference of the $2\pi$-periodic extension of the delta function at 0, with Fourier series (6.36), minus the $2\pi$-periodic extension of the delta function at $\pi$, with Fourier series (6.46), which produces (6.47).

It  is  a  remarkable,  profound  fact  that  Fourier  analysis  is  entirely  compatible  with  the 
calculus of generalized functions, [68]. For instance, term-wise differentiation of the Fourier
series  for  a  piecewise  C1  function  leads  to  the  Fourier  series  for  the  differentiated  function 
that  incorporates  delta  functions  of  the  appropriate  magnitude  at  each  jump  discontinuity. 
This  fact  further  reassures  us  that  the  rather  mysterious  construction  of  delta  functions 
and  their  generalizations  is  indeed  the  right  way  to  extend  calculus  to  functions  that  do 
not  possess  derivatives  in  the  ordinary  sense. 


Exercises 

6.1.27. Determine the real and complex Fourier series for $\delta(x - \xi)$, where $-\pi < \xi < \pi$. What periodic generalized function(s) do they represent?

6.1.28. Determine the Fourier sine series and the Fourier cosine series for $\delta(x - \xi)$, where $0 < \xi < \pi$. Which periodic generalized functions do they represent?

T 6.1.29. Let $n > 0$ be a positive integer. (a) For integers $0 \le j < n$, find the complex Fourier series of the $2\pi$-periodically extended delta functions $\tilde\delta_j(x) = \tilde\delta(x - 2j\pi/n)$. (b) Prove that their Fourier coefficients satisfy the periodicity condition $c_k = c_l$ whenever $k \equiv l \bmod n$. (c) Conversely, given complex Fourier coefficients that satisfy the periodicity condition $c_k = c_l$ whenever $k \equiv l \bmod n$, prove that the corresponding Fourier series represents a linear combination of the preceding periodically extended delta functions $\tilde\delta_0(x), \ldots, \tilde\delta_{n-1}(x)$. Hint: Use Example B.22. (d) Prove that a complex Fourier series represents a $2\pi$-periodic function that is constant on the subintervals $2\pi j/n < x < 2\pi (j + 1)/n$, for $j \in \mathbb{Z}$, if and only if its Fourier coefficients satisfy the conditions

\[ k\,c_k = l\,c_l, \quad k \equiv l \not\equiv 0 \bmod n, \qquad c_k = 0, \quad 0 \ne k \equiv 0 \bmod n. \]

X 6.1.30. (a) Find the complex Fourier series for the derivative of the delta function $\delta'(x)$ by direct evaluation of the coefficient formulas. (b) Verify that your series can be obtained by term-by-term differentiation of the series for $\delta(x)$. (c) Write a formula for the $n$th partial sum of your series. (d) Use a computer graphics package to investigate the convergence of the series.

6.1.31. What is the Fourier series for the generalized function $g(x) = x\,\delta(x)$? Can you obtain this result through multiplication of the individual Fourier series (3.37), (6.37)?

6.1.32. Apply the method of Exercise 3.2.59 to find the complex Fourier series for the function $f(x) = \delta(x)\,e^{ix}$. Which Fourier series do you get? Can you explain what is going on?

6.1.33. In Exercise 6.1.12 we established the identity $\delta(x) = 2\,\delta(2x)$. Does this hold on the level of Fourier series? Can you explain why or why not?

6.1.34. How should one interpret the formula (6.38) for the periodic extension of the delta function (a) as a limit? (b) as a linear functional?

6.1.35. Write down the complex Fourier series for $e^x$. Differentiate term by term. Do you get the same series? Explain your answer.

6.1.36. True or false: If you integrate the Fourier series for the delta function $\delta(x)$ term by term, you obtain the Fourier series for the step function $\sigma(x)$.

6.1.37. Find the Fourier series for the function $\delta(x)$ on the interval $-1 < x < 1$. Which (generalized) function does the Fourier series represent?

0 6.1.38. Prove that $\cos nx \rightharpoonup 0$ (weakly) as $n \to \infty$ on any bounded interval $[a, b]$.

6.1.39. Prove that if $u_n \to u$ in norm, then $u_n \rightharpoonup u$ weakly.

6.1.40. True or false: (a) If $u_n \to u$ uniformly on $[a, b]$, then $u_n \rightharpoonup u$ weakly. (b) If $u_n(x) \to u(x)$ pointwise, then $u_n \rightharpoonup u$ weakly.

6.1.41. Prove that the sequence $f_n(x) = \cos^2 nx$ converges weakly on $[-\pi, \pi]$. What is the limiting function?

6.1.42. Answer Exercise 6.1.41 when $f_n(x) = \cos^3 nx$.

6.1.43. Discuss the weak convergence of the Fourier series for the derivative $\delta'(x)$ of the delta function.


6.2 Green’s Functions for One-Dimensional Boundary Value Problems

We will now put the delta function to work by developing a general method for solving inhomogeneous linear boundary value problems. The key idea, motivated by the linear algebra technique outlined at the beginning of the previous section, is to first solve the system when subject to a unit delta function impulse, which produces the Green’s function. We then apply linear superposition to write down the solution for a general forcing inhomogeneity. The Green’s function approach has wide applicability, but will be developed here in the context of a few basic examples.

Example 6.9. The boundary value problem

\[ -c\,u'' = f(x), \qquad u(0) = 0 = u(1), \tag{6.48} \]

models the longitudinal deformation $u(x)$ of a homogeneous elastic bar of unit length and constant stiffness $c$ that is fixed at both ends while subject to an external force $f(x)$. The associated Green’s function refers to the family of solutions

\[ u(x) = G_\xi(x) = G(x; \xi) \]

induced by unit-impulse forces concentrated at a single point $0 < \xi < 1$:

\[ -c\,u'' = \delta(x - \xi), \qquad u(0) = 0 = u(1). \tag{6.49} \]

The solution to the differential equation can be straightforwardly obtained by direct integration. First, by (6.23),

\[ u'(x) = -\frac{\sigma(x - \xi)}{c} + a, \]

where $a$ is a constant of integration. A second integration leads to

\[ u(x) = -\frac{\rho(x - \xi)}{c} + a\,x + b, \tag{6.50} \]

where $\rho$ is the ramp function (6.25). The integration constants $a, b$ are fixed by the boundary conditions; since $0 < \xi < 1$, we have

\[ u(0) = b = 0, \qquad u(1) = -\frac{1 - \xi}{c} + a + b = 0, \qquad \text{and so} \qquad a = \frac{1 - \xi}{c}, \quad b = 0. \]

We deduce that the Green’s function for the problem is

\[ G(x; \xi) = -\frac{\rho(x - \xi)}{c} + \frac{(1 - \xi)\,x}{c} = \begin{cases} (1 - \xi)\,x / c, & x \le \xi, \\ \xi\,(1 - x) / c, & x \ge \xi. \end{cases} \tag{6.51} \]

As sketched in Figure 6.8, for each fixed $\xi$, the function $G_\xi(x) = G(x; \xi)$ depends continuously on $x$; its graph consists of two connected straight line segments, with a corner at the point of application of the unit impulse force.

Once we have determined the Green’s function, we are able to solve the general inhomogeneous boundary value problem (6.48) by linear superposition. We first express the forcing function $f(x)$ as a linear combination of impulses concentrated at various points along the bar. Since there is a continuum of possible positions $0 < \xi < 1$ at which impulse forces may be applied, we will use an integral to sum them, thereby writing the external force as

\[ f(x) = \int_0^1 \delta(x - \xi)\,f(\xi)\,d\xi. \tag{6.52} \]

We can interpret (6.52) as the (continuous) superposition of an infinite collection of impulses, namely $f(\xi)\,\delta(x - \xi)$, of magnitude $f(\xi)$ and concentrated at position $\xi$.

The Superposition Principle states that linear combinations of inhomogeneities produce
the selfsame linear combinations of solutions. Again, we adapt this principle to the
continuum by replacing the sums by integrals. Thus, the solution to the boundary value
problem will be the linear superposition


$$u(x) = \int_0^1 G(x;\xi)\, f(\xi)\, d\xi \tag{6.53}$$

of the Green’s function solutions to the individual unit-impulse problems.

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For  the  particular  boundary  value  problem  (6.48),  we  use  the  formula  (6.51)  for  the 
Green’s function. Breaking the resulting integral (6.53) into two parts, over the subintervals
$0 < \xi < x$ and $x < \xi < 1$, we arrive at the explicit solution formula

$$u(x) = \frac{1}{c}\int_0^x (1 - x)\,\xi\, f(\xi)\, d\xi + \frac{1}{c}\int_x^1 x\,(1 - \xi)\, f(\xi)\, d\xi. \tag{6.54}$$

For example, under a constant unit force $f \equiv 1$, (6.54) yields the solution

$$u(x) = \frac{1}{c}\int_0^x (1 - x)\,\xi\, d\xi + \frac{1}{c}\int_x^1 x\,(1 - \xi)\, d\xi = \frac{1}{2c}\,(1 - x)\,x^2 + \frac{1}{2c}\,x\,(1 - x)^2 = \frac{1}{2c}\,(x - x^2).$$

Let us, finally, convince ourselves that the superposition formula (6.54) indeed gives the
correct answer. First,

$$c\,\frac{du}{dx} = (1 - x)\,x\,f(x) + \int_0^x \bigl(-\xi f(\xi)\bigr)\, d\xi - x\,(1 - x)\,f(x) + \int_x^1 (1 - \xi)\, f(\xi)\, d\xi = -\int_0^1 \xi f(\xi)\, d\xi + \int_x^1 f(\xi)\, d\xi.$$

Differentiating again with respect to $x$, we see that the first term is constant, and so
$-c\,\dfrac{d^2u}{dx^2} = f(x)$, as claimed.



Remark: In computing the derivatives of $u$, we made use of the calculus formula

$$\frac{d}{dx}\int_{\alpha(x)}^{\beta(x)} F(x,\xi)\, d\xi = F(x,\beta(x))\,\frac{d\beta}{dx} - F(x,\alpha(x))\,\frac{d\alpha}{dx} + \int_{\alpha(x)}^{\beta(x)} \frac{\partial F}{\partial x}(x,\xi)\, d\xi \tag{6.55}$$

for the derivative of an integral with variable limits, which is a straightforward
consequence of the Fundamental Theorem of Calculus and the chain rule, [8, 108]. As always,
one must exercise due care when interchanging differentiation and integration.
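
The superposition formula can also be checked numerically. The following is a minimal sketch, not part of the original text: it approximates the integral (6.53) with a midpoint rule, using the Green’s function (6.51), and compares the result with the closed-form solution $(x - x^2)/(2c)$ obtained above for $f \equiv 1$. The function names, the stiffness value `c = 2.0`, and the number of quadrature points are illustrative assumptions.

```python
import math


def green(x, xi, c=1.0):
    """Green's function (6.51) for -c u'' = f with u(0) = u(1) = 0."""
    return (1.0 - xi) * x / c if x <= xi else xi * (1.0 - x) / c


def solve_by_superposition(f, x, c=1.0, n=2000):
    """Midpoint-rule approximation of u(x) = int_0^1 G(x; xi) f(xi) d(xi)."""
    h = 1.0 / n
    return h * sum(green(x, (k + 0.5) * h, c) * f((k + 0.5) * h) for k in range(n))


if __name__ == "__main__":
    c = 2.0  # assumed stiffness, chosen only for illustration
    for x in (0.1, 0.25, 0.5, 0.9):
        approx = solve_by_superposition(lambda xi: 1.0, x, c)
        exact = (x - x * x) / (2.0 * c)
        print(f"x = {x:4.2f}   quadrature = {approx:.6f}   closed form = {exact:.6f}")
```

The same routine accepts any forcing function; for instance, with $f(x) = \sin \pi x$ the result should be close to $\sin(\pi x)/(c\,\pi^2)$, which solves $-c\,u'' = \sin \pi x$ with zero boundary values.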


We note the following basic properties, which serve to uniquely characterize the Green’s
function. First, since the delta forcing vanishes except at the point $x = \xi$, the Green’s
function satisfies the homogeneous differential equation†

$$-c\,\frac{\partial^2 G}{\partial x^2}(x;\xi) = 0 \qquad\text{for all}\qquad x \ne \xi. \tag{6.56}$$

Second, by construction, it must satisfy the boundary conditions

$$G(0;\xi) = 0 = G(1;\xi).$$

Third, for each fixed $\xi$, $G(x;\xi)$ is a continuous function of $x$, but its derivative $\partial G/\partial x$
has a jump discontinuity of magnitude $-1/c$ at the impulse point $x = \xi$. As a result, the
second derivative $\partial^2 G/\partial x^2$ has a delta function discontinuity there, and hence solves the
original impulse boundary value problem (6.49).

Finally, we cannot help but notice that the Green’s function (6.51) is a symmetric
function of its two arguments: $G(x;\xi) = G(\xi;x)$. Symmetry has the interesting physical
consequence that the displacement of the bar at position $x$ due to an impulse force
concentrated at position $\xi$ is exactly the same as the displacement of the bar at $\xi$ due
to an impulse of the same magnitude being applied at $x$. This turns out to be a rather
general, although perhaps unanticipated, phenomenon. Symmetry of the Green’s function
is a consequence of the underlying symmetry, or, more accurately, “self-adjointness”, of
the boundary value problem, a topic that will be developed in detail in Section 9.2.

† Since $G(x;\xi)$ is a function of two variables, we switch to partial derivative notation to
indicate its derivatives.
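
The characterizing properties of the Green’s function (6.51) are easy to test numerically. The sketch below is an illustration rather than part of the original text; the helper names and the sample values `c = 2.0`, `xi = 0.3` are assumptions. It checks the boundary values, the jump of $-1/c$ in the derivative at the impulse point using one-sided difference quotients (which are exact up to rounding, since the function is piecewise linear), and the symmetry in the two arguments.

```python
def green(x, xi, c=1.0):
    """Green's function (6.51) for -c u'' = f with u(0) = u(1) = 0."""
    return (1.0 - xi) * x / c if x <= xi else xi * (1.0 - x) / c


def derivative_jump(xi, c=1.0, h=1e-6):
    """Right-hand slope minus left-hand slope of G(. ; xi) at x = xi."""
    right = (green(xi + h, xi, c) - green(xi, xi, c)) / h
    left = (green(xi, xi, c) - green(xi - h, xi, c)) / h
    return right - left


if __name__ == "__main__":
    c, xi = 2.0, 0.3  # illustrative values
    print("boundary values:", green(0.0, xi, c), green(1.0, xi, c))
    print("jump in dG/dx at the impulse point:", derivative_jump(xi, c))
    print("value predicted by the text:       ", -1.0 / c)
    print("symmetry defect:", abs(green(0.2, 0.7, c) - green(0.7, 0.2, c)))
```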

Example 6.10. Let $\omega^2 > 0$ be a fixed positive constant. Let us solve the inhomogeneous
boundary value problem

$$-u'' + \omega^2 u = f(x), \qquad u(0) = u(1) = 0, \tag{6.57}$$

by constructing its Green’s function. To this end, we first analyze the effect of a delta
function inhomogeneity

$$-u'' + \omega^2 u = \delta(x - \xi), \qquad u(0) = u(1) = 0. \tag{6.58}$$

Rather than try to integrate this differential equation directly, let us appeal to the defining
properties of the Green’s function. The general solution to the homogeneous equation is a
linear combination of the two basic exponentials $e^{\omega x}$ and $e^{-\omega x}$, or better, the hyperbolic
functions

$$\cosh \omega x = \frac{e^{\omega x} + e^{-\omega x}}{2}, \qquad \sinh \omega x = \frac{e^{\omega x} - e^{-\omega x}}{2}. \tag{6.59}$$

The solutions satisfying the first boundary condition are multiples of $\sinh \omega x$, while those
satisfying the second boundary condition are multiples of $\sinh \omega(1 - x)$. Therefore, the
solution to (6.58) has the form


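$$G(x;\xi) = \begin{cases} a \sinh \omega x, & x \le \xi, \\ b \sinh \omega(1 - x), & x \ge \xi. \end{cases}$$

Continuity of $G(x;\xi)$ at $x = \xi$ requires

$$a \sinh \omega\xi = b \sinh \omega(1 - \xi). \tag{6.60}$$

At $x = \xi$, the derivative $\partial G/\partial x$ must have a jump discontinuity of magnitude $-1$ in order
that the second derivative term in (6.58) match the delta function. (The $\omega^2 u$ term clearly
cannot produce the required singularity.) Since

$$\frac{\partial G}{\partial x}(x;\xi) = \begin{cases} a\,\omega \cosh \omega x, & x < \xi, \\ -b\,\omega \cosh \omega(1 - x), & x > \xi, \end{cases}$$

the jump condition requires

$$a\,\omega \cosh \omega\xi - 1 = -b\,\omega \cosh \omega(1 - \xi). \tag{6.61}$$
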


Multiplying (6.60) by $\omega \cosh \omega(1 - \xi)$ and (6.61) by $\sinh \omega(1 - \xi)$, and then adding the
results together, we obtain

$$\sinh \omega(1 - \xi) = a\,\omega \bigl[\, \sinh \omega\xi \,\cosh \omega(1 - \xi) + \cosh \omega\xi \,\sinh \omega(1 - \xi) \,\bigr] = a\,\omega \sinh \omega, \tag{6.62}$$

where we made use of the addition formula for the hyperbolic sine:

$$\sinh(\alpha + \beta) = \sinh \alpha \cosh \beta + \cosh \alpha \sinh \beta, \tag{6.63}$$





which you are asked to prove in Exercise 6.2.13. Therefore, solving (6.61–62) for

$$a = \frac{\sinh \omega(1 - \xi)}{\omega \sinh \omega}, \qquad b = \frac{\sinh \omega\xi}{\omega \sinh \omega},$$

produces the explicit formula

$$G(x;\xi) = \begin{cases} \dfrac{\sinh \omega x \,\sinh \omega(1 - \xi)}{\omega \sinh \omega}, & x \le \xi, \\[2ex] \dfrac{\sinh \omega(1 - x) \,\sinh \omega\xi}{\omega \sinh \omega}, & x \ge \xi. \end{cases} \tag{6.64}$$

Figure 6.9. Green’s function for the boundary value problem (6.57).

A representative graph appears in Figure 6.9. As before, a corner, indicating a discontinuity
in the first derivative, appears at the point $x = \xi$ where the impulse force is applied.
Moreover, as in the previous example, $G(x;\xi) = G(\xi;x)$ is a symmetric function.

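The general solution to the inhomogeneous boundary value problem (6.57) is then
given by the superposition formula (6.53); explicitly,

$$u(x) = \int_0^1 G(x;\xi)\, f(\xi)\, d\xi = \int_0^x \frac{\sinh \omega(1 - x)\,\sinh \omega\xi}{\omega \sinh \omega}\, f(\xi)\, d\xi + \int_x^1 \frac{\sinh \omega x\,\sinh \omega(1 - \xi)}{\omega \sinh \omega}\, f(\xi)\, d\xi. \tag{6.65}$$

For example, under a constant unit force $f(x) \equiv 1$, the solution is

$$\begin{aligned} u(x) &= \int_0^x \frac{\sinh \omega(1 - x)\,\sinh \omega\xi}{\omega \sinh \omega}\, d\xi + \int_x^1 \frac{\sinh \omega x\,\sinh \omega(1 - \xi)}{\omega \sinh \omega}\, d\xi \\ &= \frac{\sinh \omega(1 - x)\,(\cosh \omega x - 1) + \sinh \omega x\,(\cosh \omega(1 - x) - 1)}{\omega^2 \sinh \omega} \\ &= \frac{1}{\omega^2} - \frac{\sinh \omega x + \sinh \omega(1 - x)}{\omega^2 \sinh \omega}. \end{aligned}$$
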


For  comparative  purposes,  the  reader  may  wish  to  rederive  this  particular  solution  by  a 
direct  calculation,  without  appealing  to  the  Green’s  function. 
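
As a numerical cross-check of this example, the sketch below evaluates the superposition integral (6.65) with a midpoint rule and compares it with the closed-form solution for $f \equiv 1$ derived above. It is only an illustration under stated assumptions: the function names, the sample value $\omega = 2$, and the quadrature resolution are not taken from the text.

```python
import math


def green(x, xi, omega):
    """Green's function (6.64) for -u'' + omega^2 u = f with u(0) = u(1) = 0."""
    lo, hi = min(x, xi), max(x, xi)  # (6.64) depends only on the smaller and larger argument
    return math.sinh(omega * lo) * math.sinh(omega * (1.0 - hi)) / (omega * math.sinh(omega))


def superpose(f, x, omega, n=4000):
    """Midpoint-rule approximation of the superposition integral (6.65)."""
    h = 1.0 / n
    return h * sum(green(x, (k + 0.5) * h, omega) * f((k + 0.5) * h) for k in range(n))


def unit_force_solution(x, omega):
    """Closed-form solution from the example for the constant force f = 1."""
    return 1.0 / omega ** 2 - (math.sinh(omega * x) + math.sinh(omega * (1.0 - x))) / (
        omega ** 2 * math.sinh(omega)
    )


if __name__ == "__main__":
    omega = 2.0  # assumed sample value
    for x in (0.2, 0.5, 0.8):
        print(x, superpose(lambda xi: 1.0, x, omega), unit_force_solution(x, omega))
```

Writing the Green’s function in terms of `min` and `max` makes its symmetry in $x$ and $\xi$ evident in the code itself.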

Example 6.11. Finally, consider the Neumann boundary value problem

$$-c\,u'' = f(x), \qquad u'(0) = 0 = u'(1), \tag{6.66}$$
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modeling  the  equilibrium  deformation  of  a  homogeneous  bar  with  two  free  ends  when 
subject  to  an  external  force  f(x).  The  Green’s  function  should  satisfy  the  particular  case 


$$-c\,u'' = \delta(x - \xi), \qquad u'(0) = 0 = u'(1),$$

when the forcing function is a concentrated impulse. As in Example 6.9, the general
solution to the latter differential equation is

$$u(x) = -\frac{\rho(x - \xi)}{c} + a\,x + b,$$

where $a, b$ are integration constants, and $\rho$ is the ramp function (6.25). However, the
Neumann boundary conditions require that

$$u'(0) = a = 0, \qquad u'(1) = -\frac{1}{c} + a = 0,$$

which cannot both be satisfied. We conclude that there is no Green’s function in this case.

The difficulty is that the Neumann boundary value problem (6.66) does not have
a unique solution, and hence cannot admit a Green’s function solution formula (6.53).
Indeed, integrating twice, we find that the general solution to the differential equation is

$$u(x) = a\,x + b - \frac{1}{c}\int_0^x\!\!\int_0^y f(z)\, dz\, dy,$$

where $a, b$ are integration constants. Since

$$u'(x) = a - \frac{1}{c}\int_0^x f(z)\, dz,$$

the boundary conditions require that

$$u'(0) = a = 0, \qquad u'(1) = a - \frac{1}{c}\int_0^1 f(z)\, dz = 0.$$

These equations are compatible if and only if

$$\int_0^1 f(z)\, dz = 0. \tag{6.67}$$

Thus, the Neumann boundary value problem admits a solution if and only if there is no
net force on the bar. Indeed, physically, if (6.67) does not hold, then, because its ends are
not attached to any support, the bar cannot stay in equilibrium, but will move off in the
direction of the net force. On the other hand, if (6.67) holds, then the solution

$$u(x) = b - \frac{1}{c}\int_0^x\!\!\int_0^y f(z)\, dz\, dy$$

is not unique, since $b$ is not constrained by the boundary conditions, and so can assume
any constant value. Physically, this means that any equilibrium configuration of the bar
can be freely translated to assume another valid equilibrium.
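
To make the compatibility condition concrete, here is a short worked instance; the particular forcing $f(x) = 1 - 2x$ is an illustrative choice, not one taken from the text. It has zero net force, so the Neumann problem is solvable, and the solution is determined only up to the additive constant $b$.

```latex
% Illustrative forcing with zero net force:
\int_0^1 (1 - 2z)\, dz = \bigl[\, z - z^2 \,\bigr]_0^1 = 0 .
% With a = 0, the derivative and the solution are
u'(x) = -\frac{1}{c}\int_0^x (1 - 2z)\, dz = -\frac{x - x^2}{c},
\qquad
u(x) = b - \frac{1}{c}\Bigl( \frac{x^2}{2} - \frac{x^3}{3} \Bigr).
% Check: u'(0) = 0 = u'(1), and -c\,u''(x) = 1 - 2x = f(x) for every constant b.
```

By contrast, the constant force $f \equiv 1$ has net force $1 \ne 0$, so the conditions $u'(0) = a = 0$ and $u'(1) = a - 1/c = 0$ are incompatible and no equilibrium exists.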


Remark: The constraint (6.67) is a manifestation of the Fredholm Alternative, to be
developed in detail in Section 9.1.
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Let  us  summarize  the  fundamental  properties  that  serve  to  completely  characterize 
the  Green’s  function  of  boundary  value  problems  governed  by  second-order  linear  ordinary 
differential  equations 


$$p(x)\,\frac{d^2u}{dx^2} + q(x)\,\frac{du}{dx} + r(x)\,u(x) = f(x), \tag{6.68}$$

combined with a pair of homogeneous boundary conditions at the ends of the interval
$[a, b]$. We assume that the coefficient functions are continuous, $p, q, r, f \in C^0[a, b]$, and
that $p(x) \ne 0$ for all $a \le x \le b$.


Basic Properties of the Green’s Function $G(x;\xi)$

(i) Solves the homogeneous differential equation at all points $x \ne \xi$.

(ii) Satisfies the homogeneous boundary conditions.

(iii) Is a continuous function of its arguments.

(iv) For each fixed $\xi$, its derivative $\partial G/\partial x$ is piecewise $C^1$, with a single jump discontinuity
of magnitude $1/p(\xi)$ at the impulse point $x = \xi$.

With the Green’s function in hand, we deduce that the solution to the general boundary
value problem (6.68) subject to the appropriate homogeneous boundary conditions is
expressed by the Green’s Function Superposition Formula

$$u(x) = \int_a^b G(x;\xi)\, f(\xi)\, d\xi. \tag{6.69}$$

The symmetry of the Green’s function is more subtle, for it relies on the self-adjointness of
the boundary value problem, an issue to be addressed in detail in Chapter 9. In the present
situation, self-adjointness requires that $q(x) = p'(x)$, in which case $G(\xi;x) = G(x;\xi)$ will
be symmetric in its arguments.

Finally, as we saw in Example 6.11, not every such boundary value problem admits
a solution, and one expects to find a Green’s function only in cases in which the solution
exists and is unique.

Theorem  6.12.  The  following  are  equivalent: 

•  The only solution to the homogeneous boundary value problem is the zero function.

•  The inhomogeneous boundary value problem has a unique solution for every choice of
forcing function.

•  The boundary value problem admits a Green’s function.


Exercises 


6.2.1. Let $c > 0$. Find the Green’s function for the boundary value problem $-c\,u'' = f(x)$,
$u(0) = 0$, $u'(1) = 0$, which models the displacement of a uniform bar of unit length with
one fixed and one free end under an external force. Then use superposition to write down
a formula for the solution. Verify that your integral formula is correct by direct
differentiation and substitution into the differential equation and boundary conditions.

6.2.2. A uniform bar of length $\ell = 4$ has constant stiffness $c = 2$. Find the Green’s function for
the case that (a) both ends are fixed; (b) one end is fixed and the other is free. (c) Why is
there no Green’s function when both ends are free?


6.2.3. A point 2 cm along a 10 cm bar experiences a displacement of 1 mm under a concentrated
force of 2 newtons applied at the midpoint of the bar. How far does the midpoint
deflect when a concentrated force of 1 newton is applied at the point 2 cm along the bar?


C 6.2.4. The boundary value problem $-\dfrac{d}{dx}\Bigl( c(x)\,\dfrac{du}{dx} \Bigr) = f(x)$, $u(0) = u(1) = 0$, models the
displacement $u(x)$ of a nonuniform elastic bar with stiffness $c(x) = \dfrac{1}{1 + x^2}$ for $0 \le x \le 1$.
(a) Find the displacement when the bar is subjected to a constant external force, $f \equiv 1$.
(b) Find the Green’s function for the boundary value problem. (c) Use the resulting
superposition formula to check your solution to part (a). (d) Which point $0 < \xi < 1$ on the
bar is the “weakest”, i.e., the bar experiences the largest displacement under a unit impulse
concentrated at that point?


6.2.5.  Answer  Exercise  6.2.4  when  c(x)  =  1  +  x. 

C 6.2.6. Consider the boundary value problem $-u'' = f(x)$, $u(0) = 0$, $u(1) = 2\,u'(1)$.
(a) Find the Green’s function. (b) Which of the fundamental properties does your Green’s
function satisfy? (c) Write down an explicit integral formula for the solution to the
boundary value problem, and prove its validity by a direct computation. (d) Explain why the
related boundary value problem $-u'' = f$, $u(0) = 0$, $u(1) = u'(1)$, does not have a Green’s
function.


T 6.2.7. For $n$ a positive integer, set

$$f_n(x) = \begin{cases} \tfrac{1}{2}\,n, & |x - \xi| < \tfrac{1}{n}, \\ 0, & \text{otherwise.} \end{cases}$$

(a) Find the solution $u_n(x)$ to the boundary value problem $-u'' = f_n(x)$, $u(0) = u(1) = 0$,
assuming $0 < \xi - \tfrac{1}{n} < \xi + \tfrac{1}{n} < 1$. (b) Prove that $\lim_{n \to \infty} u_n(x) = G(x;\xi)$ converges to the
Green’s function (6.51). Why should this be the case? (c) Reconfirm the result in part (b)
by graphing $u_5(x)$, $u_{15}(x)$, $u_{25}(x)$, along with $G(x;\xi)$ when $\xi = .3$.


6.2.8. Solve the boundary value problem $-4\,u'' + 9\,u = 0$, $u(0) = 0$, $u(2) = 1$. Is your solution
unique?

6.2.9. True or false: The Neumann boundary value problem $-u'' + u = 1$, $u'(0) = u'(1) = 0$,
has a unique solution.


6.2.10. Use the Green’s function (6.64) to solve the boundary value problem (6.57) when the
forcing function is

$$f(x) = \begin{cases} 1, & 0 < x < \tfrac{1}{2}, \\ -1, & \tfrac{1}{2} < x < 1. \end{cases}$$


6.2.11. Let $\omega > 0$. (a) Find the Green’s function for the mixed boundary value problem
$-u'' + \omega^2 u = f(x)$, $u(0) = 0$, $u'(1) = 0$.
(b) Use your Green’s function to find the solution when

$$f(x) = \begin{cases} 1, & 0 < x < \tfrac{1}{2}, \\ -1, & \tfrac{1}{2} < x < 1. \end{cases}$$


6.2.12. Suppose $\omega > 0$. Does the Neumann boundary value problem $-u'' + \omega^2 u = f(x)$,
$u'(0) = u'(1) = 0$ admit a Green’s function? If not, explain why not. If so, find it, and then
write down an integral formula for the solution of the boundary value problem.


6.2.13.  (a)  Prove  the  addition  formula  (6.63)  for  the  hyperbolic  sine  function, 
(b)  Find  the  corresponding  addition  formula  for  the  hyperbolic  cosine. 

6.2.14.  Prove  the  differentiation  formula  (6.55). 
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6.3  Green’s  Functions  for  the  Planar  Poisson  Equation 


Now we develop the Green’s function approach to solving boundary value problems
involving the two-dimensional Poisson equation (4.84). As before, the Green’s function is
characterized as the solution to the homogeneous boundary value problem in which the
inhomogeneity is a concentrated unit impulse, that is, a delta function. The solution to the
general forced boundary value problem is then obtained via linear superposition, that is,
as a convolution integral with the Green’s function.

However,  before  proceeding,  we  need  to  quickly  review  some  basic  facts  concerning 
vector  calculus  in  the  plane.  The  student  may  wish  to  consult  a  standard  multivariable 
calculus  text,  e.g.,  [8,  108],  for  additional  details. 


Calculus  in  the  Plane 


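Let $\mathbf{x} = (x, y)$ denote the usual Cartesian coordinates on $\mathbb{R}^2$. The term scalar field is
synonymous with a real-valued function $u(x,y)$, defined on a domain $\Omega \subset \mathbb{R}^2$. A
vector-valued function

$$\mathbf{v}(\mathbf{x}) = \mathbf{v}(x,y) = \begin{pmatrix} v_1(x,y) \\ v_2(x,y) \end{pmatrix} \tag{6.70}$$

is known as a (planar) vector field. A vector field assigns a vector $\mathbf{v}(x,y) \in \mathbb{R}^2$ to each point
$(x,y) \in \Omega$ in its domain of definition, and hence defines a function $\mathbf{v}\colon \Omega \to \mathbb{R}^2$. Physical
examples include velocity vector fields of fluid flows, heat flux fields in thermodynamics,
and gravitational and electrostatic force fields.

The gradient operator $\nabla$ maps a scalar field $u(x,y)$ to the vector field

$$\nabla u = \begin{pmatrix} \partial u/\partial x \\ \partial u/\partial y \end{pmatrix}. \tag{6.71}$$

The scalar field $u$ is often referred to as a potential function for its gradient vector field
$\mathbf{v} = \nabla u$. On a connected domain $\Omega$, the potential, when it exists, is uniquely determined
up to addition of a constant.

The divergence of the planar vector field $\mathbf{v} = (v_1, v_2)^T$ is the scalar field

$$\nabla \cdot \mathbf{v} = \operatorname{div} \mathbf{v} = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y}. \tag{6.72}$$

Its curl is defined as

$$\nabla \times \mathbf{v} = \operatorname{curl} \mathbf{v} = \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y}. \tag{6.73}$$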

Notice that the curl of a planar vector field is a scalar field. (In contrast, in three
dimensions, the curl of a vector field is another vector field.) Given a smooth potential $u \in C^2$,
the curl of its gradient vector field automatically vanishes:

$$\nabla \times \nabla u = \frac{\partial}{\partial x}\,\frac{\partial u}{\partial y} - \frac{\partial}{\partial y}\,\frac{\partial u}{\partial x} = 0,$$

by the equality of mixed partials. Thus, a necessary condition for a vector field $\mathbf{v}$ to admit
a potential is that it be irrotational, meaning $\nabla \times \mathbf{v} = 0$; this condition is sufficient if
the underlying domain $\Omega$ is simply connected, i.e., has no holes. On the other hand, the
divergence of a gradient vector field coincides with the Laplacian of the potential function:

$$\nabla \cdot \nabla u = \Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}. \tag{6.74}$$

Figure 6.10. Orientation of the boundary of a planar domain.



A vector field is incompressible if it has zero divergence: $\nabla \cdot \mathbf{v} = 0$; for the velocity vector
field of a steady-state fluid flow, incompressibility means that the fluid does not change
volume. (Water is, for all practical purposes, an incompressible fluid.) Therefore, an
irrotational vector field with potential $u$ is also incompressible if and only if the potential
solves the Laplace equation $\Delta u = 0$.
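
A small worked instance may help fix these definitions; the potentials chosen here are illustrative and do not come from the text.

```latex
% Harmonic potential: irrotational and incompressible gradient field.
u(x,y) = x^2 - y^2, \qquad
\mathbf{v} = \nabla u = \begin{pmatrix} 2x \\ -2y \end{pmatrix}, \qquad
\nabla \times \mathbf{v} = \frac{\partial(-2y)}{\partial x} - \frac{\partial(2x)}{\partial y} = 0, \qquad
\nabla \cdot \mathbf{v} = 2 - 2 = 0 = \Delta u .
% Non-harmonic potential: still irrotational, but not incompressible.
u(x,y) = x^2 + y^2, \qquad
\mathbf{v} = \nabla u = \begin{pmatrix} 2x \\ 2y \end{pmatrix}, \qquad
\nabla \times \mathbf{v} = 0, \qquad
\nabla \cdot \mathbf{v} = 4 = \Delta u \ne 0 .
```

In both cases the curl vanishes because the field is a gradient; only the first potential solves the Laplace equation, so only the first field is incompressible.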


Remark: Because of formula (6.74), the Laplacian operator is also sometimes written
as $\Delta = \nabla^2$. The factorization of the Laplacian into the product of the divergence and the
gradient operators is, in fact, of great importance, and underlies its “self-adjointness”, a
fundamental property whose ramifications will be explored in depth in Chapter 9.


Let $\Omega \subset \mathbb{R}^2$ be a bounded domain whose boundary $\partial\Omega$ consists of one or more piecewise
smooth closed curves. We orient the boundary so that the domain is always on one’s left
as one goes around the boundary curve(s). Figure 6.10 sketches a domain with two holes;
its three boundary curves are oriented according to the directions of the arrows. Note that
the outer boundary curve is traversed in a counterclockwise direction, while the two inner
boundary curves are oriented clockwise.

Green’s Theorem, first formulated by George Green to use in his seminal study of
partial differential equations and potential theory, relates certain double integrals over a
domain to line integrals around its boundary. It should be viewed as the extension of the
Fundamental Theorem of Calculus to double integrals.


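Theorem 6.13. Let $\mathbf{v}(\mathbf{x})$ be a smooth† vector field defined on a bounded domain
$\Omega \subset \mathbb{R}^2$. Then the line integral of $\mathbf{v}$ around the boundary $\partial\Omega$ equals the double integral of
its curl over the domain:

$$\iint_\Omega \nabla \times \mathbf{v}\; dx\, dy = \oint_{\partial\Omega} \mathbf{v} \cdot d\mathbf{x}, \tag{6.75}$$

or, in full detail,

$$\iint_\Omega \Bigl( \frac{\partial v_2}{\partial x} - \frac{\partial v_1}{\partial y} \Bigr)\, dx\, dy = \oint_{\partial\Omega} v_1\, dx + v_2\, dy. \tag{6.76}$$

† To be precise, we require $\mathbf{v}$ to be continuously differentiable within the domain, and
continuous up to the boundary, so $\mathbf{v} \in C^0(\overline{\Omega}) \cap C^1(\Omega)$, where $\overline{\Omega} = \Omega \cup \partial\Omega$ denotes the closure of the
domain $\Omega$.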


Example 6.14. Let us apply Green’s Theorem 6.13 to the particular vector field
$\mathbf{v} = (y, 0)^T$. Since $\nabla \times \mathbf{v} = -1$, we obtain

$$\oint_{\partial\Omega} y\, dx = \iint_\Omega (-1)\, dx\, dy = -\operatorname{area} \Omega. \tag{6.77}$$

This  means  that  we  can  determine  the  area  of  a  planar  domain  by  computing  the  negative 
of  the  indicated  line  integral  around  its  boundary. 
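
For a polygonal domain the line integral in (6.77) can be evaluated edge by edge. The following sketch is an illustration under the assumption that the vertices are listed counterclockwise, matching the orientation convention above; the function name is hypothetical. On each straight edge $y$ varies linearly, so the integral of $y\,dx$ over the edge is the average of the endpoint heights times the change in $x$.

```python
def area_by_line_integral(vertices):
    """Area of a polygon via  area = -(closed line integral of y dx), cf. (6.77).

    `vertices` is a list of (x, y) pairs in counterclockwise order.
    """
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]  # wrap around to close the boundary curve
        # y is linear along the edge, so the edge integral of y dx is exact here.
        total += 0.5 * (y0 + y1) * (x1 - x0)
    return -total


if __name__ == "__main__":
    unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
    print(area_by_line_integral(unit_square))  # area of the unit square
    print(area_by_line_integral(triangle))     # area of a 3-4-5 right triangle
```

Listing the vertices clockwise instead reverses the orientation of the boundary and flips the sign of the result.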

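For later purposes, we rewrite the basic Green identity (6.75) in an equivalent “divergence
form”. Given a planar vector field $\mathbf{v} = (v_1, v_2)^T$, let

$$\mathbf{v}^\perp = \begin{pmatrix} -v_2 \\ v_1 \end{pmatrix} \tag{6.78}$$

denote the “perpendicular” vector field. We note that its curl

$$\nabla \times \mathbf{v}^\perp = \frac{\partial v_1}{\partial x} + \frac{\partial v_2}{\partial y} = \nabla \cdot \mathbf{v} \tag{6.79}$$

coincides with the divergence of the original vector field.

When we replace $\mathbf{v}$ in Green’s identity (6.75) by $\mathbf{v}^\perp$, the result is

$$\iint_\Omega \nabla \cdot \mathbf{v}\; dx\, dy = \iint_\Omega \nabla \times \mathbf{v}^\perp\; dx\, dy = \oint_{\partial\Omega} \mathbf{v}^\perp \cdot d\mathbf{x} = \oint_{\partial\Omega} \mathbf{v} \cdot \mathbf{n}\; ds,$$

where $\mathbf{n}$ denotes the unit outwards normal to the boundary of our domain, while $ds$ denotes
the arc-length element along the boundary curve. This yields the divergence form of Green’s
Theorem:

$$\iint_\Omega \nabla \cdot \mathbf{v}\; dx\, dy = \oint_{\partial\Omega} \mathbf{v} \cdot \mathbf{n}\; ds. \tag{6.80}$$

Physically, if $\mathbf{v}$ represents the velocity vector field of a steady-state fluid flow, then the
line integral in (6.80) represents the net fluid flux out of the region $\Omega$. As a result, the
divergence $\nabla \cdot \mathbf{v}$ represents the local change in area of the fluid at each point, which serves
to justify our earlier statement on incompressibility.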
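
The divergence form (6.80) can be tested numerically on a simple domain. The sketch below is only an illustration: it takes $\Omega$ to be the unit disk and the vector field $\mathbf{v} = (x^3, y^3)^T$, both assumptions made for this example, and compares a polar-coordinate midpoint rule for the double integral of $\nabla \cdot \mathbf{v} = 3x^2 + 3y^2$ with a midpoint rule for the flux through the unit circle. Both values should be close to $3\pi/2$.

```python
import math


def flux_through_unit_circle(v, n=2000):
    """Midpoint-rule approximation of the flux integral of v . n ds over the unit circle."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        nx, ny = math.cos(t), math.sin(t)  # on the unit circle the outward normal equals the position
        v1, v2 = v(nx, ny)
        total += (v1 * nx + v2 * ny) * h   # arc length element ds = dt on the unit circle
    return total


def integral_over_unit_disk(g, nr=400, nt=400):
    """Midpoint rule in polar coordinates, using dx dy = r dr dt."""
    hr, ht = 1.0 / nr, 2.0 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * hr
        for j in range(nt):
            t = (j + 0.5) * ht
            total += g(r * math.cos(t), r * math.sin(t)) * r * hr * ht
    return total


def v(x, y):
    """Illustrative vector field v = (x^3, y^3)."""
    return (x ** 3, y ** 3)


def div_v(x, y):
    """Its divergence, 3 x^2 + 3 y^2."""
    return 3.0 * x ** 2 + 3.0 * y ** 2


if __name__ == "__main__":
    print("double integral of div v:", integral_over_unit_disk(div_v))
    print("boundary flux of v:      ", flux_through_unit_circle(v))
    print("3 pi / 2:                ", 1.5 * math.pi)
```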

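Consider next the product vector field $u\,\mathbf{v}$ obtained by multiplying a vector field $\mathbf{v}$ by
a scalar field $u$. An elementary computation proves that its divergence is

$$\nabla \cdot (u\,\mathbf{v}) = u\, \nabla \cdot \mathbf{v} + \nabla u \cdot \mathbf{v}. \tag{6.81}$$

Replacing $\mathbf{v}$ by $u\,\mathbf{v}$ in the divergence formula (6.80), we deduce what is usually referred to
as Green’s formula

$$\iint_\Omega \bigl( u\, \nabla \cdot \mathbf{v} + \nabla u \cdot \mathbf{v} \bigr)\, dx\, dy = \oint_{\partial\Omega} u\, (\mathbf{v} \cdot \mathbf{n})\, ds, \tag{6.82}$$

which is valid for arbitrary bounded domains $\Omega$, and arbitrary $C^1$ scalar and vector fields
defined thereon. Rearranging the terms produces

$$\iint_\Omega \nabla u \cdot \mathbf{v}\; dx\, dy = \oint_{\partial\Omega} u\, (\mathbf{v} \cdot \mathbf{n})\, ds - \iint_\Omega u\, \nabla \cdot \mathbf{v}\; dx\, dy. \tag{6.83}$$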
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We  will  view  this  identity  as  an  integration  by  parts  formula  for  double  integrals.  Indeed, 
comparing with the one-dimensional integration by parts formula

$$\int_a^b u'(x)\, v(x)\, dx = u(x)\, v(x)\, \Big|_{x = a}^{b} - \int_a^b u(x)\, v'(x)\, dx, \tag{6.84}$$



we  observe  that  the  single  integrals  have  become  double  integrals;  the  derivatives  are  vector 
derivatives  (gradient  and  divergence),  while  the  boundary  contributions  at  the  endpoints 
of  the  interval  are  replaced  by  a  line  integral  around  the  entire  boundary  of  the  two- 
dimensional  domain. 

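A useful special case of (6.82) is that in which $\mathbf{v} = \nabla v$ is the gradient of a scalar field
$v$. Then, in view of (6.74), Green’s formula (6.82) becomes

$$\iint_\Omega \bigl( u\, \Delta v + \nabla u \cdot \nabla v \bigr)\, dx\, dy = \oint_{\partial\Omega} u\, \frac{\partial v}{\partial n}\, ds, \tag{6.85}$$

where $\partial v/\partial n = \nabla v \cdot \mathbf{n}$ is the normal derivative of the scalar field $v$ on the boundary of the
domain. In particular, setting $v = u$, we deduce

$$\iint_\Omega \bigl( u\, \Delta u + \| \nabla u \|^2 \bigr)\, dx\, dy = \oint_{\partial\Omega} u\, \frac{\partial u}{\partial n}\, ds. \tag{6.86}$$
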


As  an  application,  we  establish  a  basic  uniqueness  theorem  for  solutions  to  the  boundary 
value  problems  for  the  Poisson  equation: 


Theorem 6.15. Suppose $\tilde{u}$ and $u$ both satisfy the same inhomogeneous Dirichlet or
mixed boundary value problem for the Poisson equation on a connected, bounded domain
$\Omega$. Then $\tilde{u} = u$. On the other hand, if $\tilde{u}$ and $u$ satisfy the same Neumann boundary value
problem, then $\tilde{u} = u + c$ for some constant $c$.


Proof: Since, by assumption, $-\Delta \tilde{u} = f = -\Delta u$, the difference $v = \tilde{u} - u$ satisfies
the Laplace equation $\Delta v = 0$ in $\Omega$, and satisfies the homogeneous boundary conditions.
Therefore, applying (6.86) to $v$, we find

$$\iint_\Omega \| \nabla v \|^2\, dx\, dy = \oint_{\partial\Omega} v\, \frac{\partial v}{\partial n}\, ds = 0,$$

since, at every point on the boundary, either $v = 0$ or $\partial v/\partial n = 0$. Since the integrand is
continuous and everywhere nonnegative, we immediately conclude that $\| \nabla v \|^2 = 0$, and
hence $\nabla v = 0$ throughout $\Omega$. On a connected domain, the only functions annihilated by
the gradient operator are the constants:

Lemma 6.16. If $v(x,y)$ is a $C^1$ function defined on a connected domain $\Omega \subset \mathbb{R}^2$,
then $\nabla v = 0$ if and only if $v(x,y) = c$ is a constant.

Proof: Let $\mathbf{a}, \mathbf{b}$ be any two points in $\Omega$. Then, by connectivity, we can find a curve $C$
connecting them. The Fundamental Theorem for line integrals, [8, 108], states that

$$\int_C \nabla v \cdot d\mathbf{x} = v(\mathbf{b}) - v(\mathbf{a}).$$

Thus, if $\nabla v = 0$, then $v(\mathbf{b}) = v(\mathbf{a})$ for all $\mathbf{a}, \mathbf{b} \in \Omega$, which implies that $v$ must be
constant. Q.E.D.

Returning to our proof, we conclude that $\tilde{u} = u + v = u + c$, which proves the result
in the Neumann case. In the Dirichlet or mixed problems, there is at least one point on
the boundary where $v = 0$, and hence the only possible constant is $v = c = 0$, proving that
$\tilde{u} = u$. Q.E.D.


Thus,  the  Dirichlet  and  mixed  boundary  value  problems  admit  at  most  one  solution, 
while  the  Neumann  boundary  value  problem  has  either  no  solutions  or  infinitely  many 
solutions.  Proof  of  existence  of  solutions  is  more  challenging,  and  will  be  left  to  a  more 
advanced  text,  e.g.,  [35,  44,  61,  70]. 

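If we subtract from formula (6.85) the formula

$$\iint_\Omega \bigl( v\, \Delta u + \nabla u \cdot \nabla v \bigr)\, dx\, dy = \oint_{\partial\Omega} v\, \frac{\partial u}{\partial n}\, ds, \tag{6.87}$$

obtained by interchanging $u$ and $v$, we obtain the identity

$$\iint_\Omega \bigl( u\, \Delta v - v\, \Delta u \bigr)\, dx\, dy = \oint_{\partial\Omega} \Bigl( u\, \frac{\partial v}{\partial n} - v\, \frac{\partial u}{\partial n} \Bigr)\, ds, \tag{6.88}$$

which will play a major role in our analysis of the Poisson equation. Setting $v = 1$ in (6.87)
yields

$$\iint_\Omega \Delta u\; dx\, dy = \oint_{\partial\Omega} \frac{\partial u}{\partial n}\, ds. \tag{6.89}$$

Suppose $u$ solves the Neumann boundary value problem

$$-\Delta u = f \quad \text{in } \Omega, \qquad \frac{\partial u}{\partial n} = h \quad \text{on } \partial\Omega.$$

Then (6.89) requires that

$$\iint_\Omega f\; dx\, dy + \oint_{\partial\Omega} h\; ds = 0, \tag{6.90}$$

which thus forms a necessary condition for the existence of a solution $u$ to the
inhomogeneous Neumann boundary value problem. Physically, if $u$ represents the equilibrium
temperature of a plate, then the integrals in (6.89) measure the net gain or loss in heat
energy due to, respectively, the external heat source and the heat flux through the boundary.
Equation (6.90) is telling us that, for the plate to remain in thermal equilibrium, there can
be no net change in its total heat energy.
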
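
As a concrete check of the compatibility condition (6.90), consider the following worked instance on the unit disk; the data $f \equiv 1$ and $h \equiv -\tfrac{1}{2}$ are illustrative choices, not taken from the text.

```latex
% Unit disk \Omega = \{ x^2 + y^2 < 1 \}, with f \equiv 1 and h \equiv -\tfrac12:
\iint_\Omega f \; dx\, dy + \oint_{\partial\Omega} h \; ds
  = \pi + \bigl( -\tfrac{1}{2} \bigr)(2\pi) = 0 ,
% so (6.90) holds, and indeed a solution exists:
u(x,y) = -\tfrac{1}{4}\,(x^2 + y^2) + c, \qquad
-\Delta u = \tfrac{1}{4}\,(2 + 2) = 1 = f, \qquad
\frac{\partial u}{\partial n}\Big|_{r = 1} = \frac{\partial u}{\partial r}\Big|_{r = 1} = -\tfrac{1}{2} = h .
```

The constant $c$ is arbitrary, in line with Theorem 6.15. With the same $f \equiv 1$ but $h \equiv 0$, the left-hand side of (6.90) equals $\pi \ne 0$, so that Neumann problem has no solution.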


The  Two-Dimensional  Delta  Function 

Now let us return to the business at hand: solving the Poisson equation on a bounded
domain $\Omega \subset \mathbb{R}^2$. We will subject the solution to either homogeneous Dirichlet boundary
conditions or homogeneous mixed boundary conditions. (As we just noted, the Neumann
boundary value problem does not admit a unique solution, and hence does not possess a
Green’s function.) The Green’s function for the boundary value problem arises when the
forcing function is a unit impulse concentrated at a single point in the domain.

Thus, our first task is to establish the proper form for a unit impulse in our
two-dimensional context. The delta function concentrated at a point $\boldsymbol{\xi} = (\xi, \eta) \in \mathbb{R}^2$ is denoted
by

$$\delta_{(\xi,\eta)}(x,y) = \delta_{\boldsymbol{\xi}}(\mathbf{x}) = \delta(\mathbf{x} - \boldsymbol{\xi}) = \delta(x - \xi,\, y - \eta), \tag{6.91}$$

and is designed so that

$$\delta_{\boldsymbol{\xi}}(\mathbf{x}) = 0 \quad \text{for } \mathbf{x} \ne \boldsymbol{\xi}, \qquad \iint_\Omega \delta_{(\xi,\eta)}(x,y)\, dx\, dy = 1, \qquad \boldsymbol{\xi} \in \Omega. \tag{6.92}$$

Figure 6.11. Gaussian functions converging to the delta function.


In particular, $\delta(x,y) = \delta_{\mathbf{0}}(x,y)$ represents the delta function at the origin. As in the
one-dimensional version, there is no ordinary function that satisfies both criteria; rather,
$\delta(x,y)$ is to be viewed as the limit of a sequence of more and more highly concentrated
functions $g_n(x,y)$, with

$$\lim_{n \to \infty} g_n(x,y) = 0 \quad \text{for } (x,y) \ne (0,0), \qquad \text{while} \qquad \iint_{\mathbb{R}^2} g_n(x,y)\, dx\, dy = 1.$$

A good example of a suitable sequence is provided by the radial Gaussian functions

$$g_n(x,y) = \frac{n}{\pi}\, e^{-n (x^2 + y^2)}. \tag{6.93}$$

As plotted in Figure 6.11, as $n \to \infty$, the Gaussian profiles become more and more
concentrated near the origin, while maintaining a unit volume underneath their graphs. The
fact that their integral over $\mathbb{R}^2$ equals 1 is a consequence of (2.99).
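
Both features of the Gaussian sequence (6.93), unit volume and increasing concentration at the origin, can be observed numerically. The sketch below is illustrative only: the helper names, the truncation of the plane to a square of half-width 6, and the small test box of half-width 0.2 are assumptions made for this example.

```python
import math


def g(n, x, y):
    """Radial Gaussian (6.93): (n / pi) * exp(-n (x^2 + y^2))."""
    return (n / math.pi) * math.exp(-n * (x * x + y * y))


def integrate_on_square(f, half_width, m=400):
    """Midpoint rule for the double integral of f over [-half_width, half_width]^2."""
    h = 2.0 * half_width / m
    total = 0.0
    for i in range(m):
        x = -half_width + (i + 0.5) * h
        for j in range(m):
            y = -half_width + (j + 0.5) * h
            total += f(x, y)
    return total * h * h


if __name__ == "__main__":
    for n in (1, 10, 100):
        total_mass = integrate_on_square(lambda x, y: g(n, x, y), 6.0)
        mass_near_origin = integrate_on_square(lambda x, y: g(n, x, y), 0.2)
        print(f"n = {n:3d}   total volume = {total_mass:.6f}   "
              f"volume over the small box = {mass_near_origin:.4f}")
```

The total volume stays at 1 for every $n$, while the fraction of it lying over the small box around the origin grows toward 1, which is the sense in which $g_n$ converges to the delta function.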

Alternatively, one can assign the delta function a dual interpretation as the linear
functional

$$L_{(\xi,\eta)}[u] = L_{\boldsymbol{\xi}}[u] = u(\boldsymbol{\xi}) = u(\xi, \eta), \tag{6.94}$$

which assigns to each continuous function $u \in C^0(\Omega)$ its value at the point $\boldsymbol{\xi} = (\xi, \eta) \in \Omega$.
Then, using the $L^2$ inner product

$$\langle u, v \rangle = \iint_\Omega u(x,y)\, v(x,y)\, dx\, dy \tag{6.95}$$

between scalar fields $u, v \in C^0(\Omega)$, we formally identify the linear functional $L_{(\xi,\eta)}$ with
the delta “function” by the integral formula

$$\langle \delta_{(\xi,\eta)}, u \rangle = \iint_\Omega \delta_{(\xi,\eta)}(x,y)\, u(x,y)\, dx\, dy = \begin{cases} u(\xi, \eta), & (\xi,\eta) \in \Omega, \\ 0, & (\xi,\eta) \in \mathbb{R}^2 \setminus \overline{\Omega}, \end{cases} \tag{6.96}$$

for any $u \in C^0(\Omega)$. As in the one-dimensional version, we will avoid defining the integral
when the delta function is concentrated at a boundary point of the domain.

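Since double integrals can be evaluated as repeated one-dimensional integrals, we can conveniently view

$$\delta_{(\xi,\eta)}(x, y) = \delta_{\xi}(x)\,\delta_{\eta}(y) = \delta(x - \xi)\,\delta(y - \eta) \tag{6.97}$$

as the product† of a pair of one-dimensional delta functions. Indeed, if the impulse point

$$(\xi, \eta) \in R = \{\, a < x < b,\ c < y < d \,\} \subset \Omega$$

is contained in a rectangle that lies within the domain, then

$$\begin{aligned}
\iint_{\Omega} \delta_{(\xi,\eta)}(x, y)\, u(x, y)\,dx\,dy &= \iint_{R} \delta_{(\xi,\eta)}(x, y)\, u(x, y)\,dx\,dy \\
&= \int_a^b \left( \int_c^d \delta(x - \xi)\,\delta(y - \eta)\, u(x, y)\,dy \right) dx = \int_a^b \delta(x - \xi)\, u(x, \eta)\,dx = u(\xi, \eta).
\end{aligned}$$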


The  Green’s  Function 


As in the one-dimensional context, the Green’s function is defined as the solution to the inhomogeneous differential equation when subject to a concentrated unit delta impulse at a prescribed point $\boldsymbol{\xi} = (\xi, \eta) \in \Omega$ inside the domain. In the current situation, the Poisson equation takes the form

$$-\Delta u = \delta_{\boldsymbol{\xi}}, \qquad \text{or, explicitly,} \qquad -\frac{\partial^2 u}{\partial x^2} - \frac{\partial^2 u}{\partial y^2} = \delta(x - \xi)\,\delta(y - \eta). \tag{6.98}$$


The function $u(x, y)$ is also subject to some homogeneous boundary conditions, e.g., the Dirichlet conditions $u = 0$ on $\partial\Omega$. The resulting solution is called the Green’s function for the boundary value problem, and written

$$G_{\boldsymbol{\xi}}(\mathbf{x}) = G(\mathbf{x}; \boldsymbol{\xi}) = G(x, y; \xi, \eta). \tag{6.99}$$

Once we know the Green’s function, the solution to the general Poisson boundary value problem

$$-\Delta u = f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega \tag{6.100}$$

is reconstructed as follows. We regard the forcing function

$$f(x, y) = \iint_{\Omega} \delta(x - \xi)\,\delta(y - \eta)\, f(\xi, \eta)\,d\xi\,d\eta$$

as a superposition of delta impulses, whose strength equals the value of $f$ at the impulse point. Linearity implies that the solution to the boundary value problem is the corresponding superposition of Green’s function responses to each of the constituent impulses. The net result is the fundamental superposition formula

$$u(x, y) = \iint_{\Omega} G(x, y; \xi, \eta)\, f(\xi, \eta)\,d\xi\,d\eta \tag{6.101}$$

for the solution to the boundary value problem. Indeed,

$$-\Delta u(x, y) = \iint_{\Omega} \bigl[ -\Delta G(x, y; \xi, \eta) \bigr] f(\xi, \eta)\,d\xi\,d\eta = \iint_{\Omega} \delta(x - \xi,\, y - \eta)\, f(\xi, \eta)\,d\xi\,d\eta = f(x, y),$$

while the fact that $G(x, y; \xi, \eta) = 0$ for all $(x, y) \in \partial\Omega$ implies that $u(x, y) = 0$ on the boundary.

† This is an exception to our earlier injunction not to multiply delta functions. Multiplication is allowed when they depend on different variables.

The Green’s function inevitably turns out to be symmetric under interchange of its arguments:

$$G(\xi, \eta; x, y) = G(x, y; \xi, \eta). \tag{6.102}$$

As  in  the  one-dimensional  case,  symmetry  is  a  consequence  of  the  self-adjointness  of  the 
boundary  value  problem,  and  will  be  explained  in  full  in  Chapter  9.  Symmetry  has  the 
following intriguing physical interpretation: Let $\mathbf{x}, \boldsymbol{\xi} \in \Omega$ be any two points in the domain.
We  apply  a  concentrated  unit  force  to  the  membrane  at  the  first  point  and  measure  its 
deflection  at  the  second;  the  result  is  exactly  the  same  as  if  we  applied  the  impulse  at 
the  second  point  and  measured  the  deflection  at  the  first.  (Deflections  at  other  points 
in  the  domain  will  typically  have  no  obvious  relation  with  one  another.)  Similarly,  in 
electrostatics,  the  solution  u(x,  y)  is  interpreted  as  the  electrostatic  potential  for  a  system 
of  charges  in  equilibrium.  A  delta  function  corresponds  to  a  point  charge,  e.g.,  an  electron. 
The symmetry property says that the electrostatic potential at $\mathbf{x}$ due to a point charge placed at position $\boldsymbol{\xi}$ is exactly the same as the potential at $\boldsymbol{\xi}$ due to a point charge at $\mathbf{x}$.
The  reader  may  wish  to  meditate  on  the  physical  plausibility  of  these  striking  facts. 

Unfortunately, most Green’s functions cannot be written down in closed form. One important exception occurs when the domain is the entire plane: $\Omega = \mathbb{R}^2$. The solution to the Poisson equation (6.98) is the free-space Green’s function $G_0(x, y; \xi, \eta) = G_0(\mathbf{x}; \boldsymbol{\xi})$, which measures the effect of a unit impulse, concentrated at $\boldsymbol{\xi}$, throughout two-dimensional space, e.g., the gravitational potential due to a point mass or the electrostatic potential due to a point charge. To motivate the construction, let us appeal to physical intuition. First, since the concentrated impulse is zero when $\mathbf{x} \ne \boldsymbol{\xi}$, the function must solve the homogeneous Laplace equation

$$-\Delta G_0 = 0 \quad \text{for all} \quad \mathbf{x} \ne \boldsymbol{\xi}. \tag{6.103}$$

Second,  since  the  Poisson  equation  is  modeling  a  homogeneous,  uniform  medium,  in  the 
absence  of  boundary  conditions  the  effect  of  a  unit  impulse  should  depend  only  on  the 
distance from its source. Therefore, we expect $G_0$ to be a function of the radial variable alone:

$$G_0(x, y; \xi, \eta) = v(r), \qquad \text{where} \qquad r = \|\mathbf{x} - \boldsymbol{\xi}\| = \sqrt{(x - \xi)^2 + (y - \eta)^2}\,.$$


According to (4.113), the only radially symmetric solutions to the Laplace equation are

$$v(r) = a + b \log r, \tag{6.104}$$

where $a$ and $b$ are constants. The constant term $a$ has zero derivative, and so cannot contribute to the delta function singularity. Therefore, we expect the required solution to be a multiple of the logarithmic term. To determine the multiple, consider a closed disk of radius $\varepsilon > 0$ centered at $\boldsymbol{\xi}$,

$$D_\varepsilon = \{\, 0 \le r \le \varepsilon \,\} = \{\, \|\mathbf{x} - \boldsymbol{\xi}\| \le \varepsilon \,\},$$

with circular boundary

$$C_\varepsilon = \partial D_\varepsilon = \{\, r = \|\mathbf{x} - \boldsymbol{\xi}\| = \varepsilon \,\} = \{\, (\xi + \varepsilon \cos\theta,\ \eta + \varepsilon \sin\theta) \mid -\pi < \theta \le \pi \,\}.$$

Then, by (6.74) and the divergence form (6.80) of Green’s Theorem,


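$$\begin{aligned}
1 &= \iint_{D_\varepsilon} \delta(x - \xi,\, y - \eta)\,dx\,dy = -b \iint_{D_\varepsilon} \Delta(\log r)\,dx\,dy = -b \iint_{D_\varepsilon} \nabla \cdot \nabla(\log r)\,dx\,dy \\
&= -b \oint_{C_\varepsilon} \frac{\partial(\log r)}{\partial n}\,ds = -b \oint_{C_\varepsilon} \frac{\partial(\log r)}{\partial r}\,ds = -b \oint_{C_\varepsilon} \frac{1}{r}\,ds = -b \int_{-\pi}^{\pi} d\theta = -2\pi b,
\end{aligned} \tag{6.105}$$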


and hence $b = -1/(2\pi)$. We conclude that the free-space Green’s function should have the logarithmic form

$$G_0(x, y; \xi, \eta) = -\frac{1}{2\pi} \log r = -\frac{1}{2\pi} \log \|\mathbf{x} - \boldsymbol{\xi}\| = -\frac{1}{4\pi} \log \bigl[ (x - \xi)^2 + (y - \eta)^2 \bigr]. \tag{6.106}$$
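The two facts used to arrive at (6.106) can be sanity-checked numerically: the logarithmic potential is harmonic away from the impulse point, and the flux of $-\nabla G_0$ through any small circle around the impulse point equals 1, matching the computation (6.105). The following sketch is only illustrative; the function names `G0`, `laplacian`, and `flux_through_circle`, the finite-difference step sizes, and the sample points are assumptions made for this example.

```python
import math

def G0(x, y, xi, eta):
    """Free-space Green's function (6.106) with the impulse at (xi, eta)."""
    return -math.log((x - xi) ** 2 + (y - eta) ** 2) / (4.0 * math.pi)

def laplacian(F, x, y, h=1e-3):
    """Five-point finite-difference approximation to the Laplacian of F at (x, y)."""
    return (F(x + h, y) + F(x - h, y) + F(x, y + h) + F(x, y - h)
            - 4.0 * F(x, y)) / (h * h)

def flux_through_circle(xi, eta, eps, m=360, h=1e-6):
    """Estimate the line integral of -dG0/dn over the circle of radius eps
    centered at (xi, eta); the radial derivative uses a centered difference."""
    total = 0.0
    for k in range(m):
        t = 2.0 * math.pi * (k + 0.5) / m
        c, s = math.cos(t), math.sin(t)
        outer = G0(xi + (eps + h) * c, eta + (eps + h) * s, xi, eta)
        inner = G0(xi + (eps - h) * c, eta + (eps - h) * s, xi, eta)
        total -= (outer - inner) / (2.0 * h)
    # Multiply by the arc length element 2 * pi * eps / m.
    return total * 2.0 * math.pi * eps / m

if __name__ == "__main__":
    source = (0.3, -0.2)
    # Laplacian at a point away from the source: should be close to zero.
    print(laplacian(lambda x, y: G0(x, y, *source), 1.0, 0.7))
    # Flux through circles of two different radii: both should be close to one.
    print(flux_through_circle(*source, 0.5), flux_through_circle(*source, 0.01))
```

That the flux does not depend on the radius is exactly what one expects of a unit impulse concentrated at a single point.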

A  fully  rigorous,  albeit  more  difficult,  justification  of  (6.106)  comes  from  the  following 
important  result,  known  as  Green’s  representation  formula. 

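Theorem 6.17. Let $\Omega \subset \mathbb{R}^2$ be a bounded domain, with piecewise $C^1$ boundary $\partial\Omega$. Suppose $u \in C^2(\Omega) \cap C^1(\overline{\Omega})$. Then, for any $(x, y) \in \Omega$,

$$\begin{aligned}
u(x, y) = {} & -\iint_{\Omega} G_0(x, y; \xi, \eta)\, \Delta u(\xi, \eta)\,d\xi\,d\eta \\
& + \oint_{\partial\Omega} \left( G_0(x, y; \xi, \eta)\, \frac{\partial u}{\partial n}(\xi, \eta) - \frac{\partial G_0}{\partial n}(x, y; \xi, \eta)\, u(\xi, \eta) \right) ds,
\end{aligned} \tag{6.107}$$

where the Laplacian and the normal derivatives on the boundary are all taken with respect to the integration variables $\boldsymbol{\xi} = (\xi, \eta)$.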

In particular, if both $u$ and $\partial u/\partial n$ vanish on $\partial\Omega$, then (6.107) reduces to

$$u(x, y) = -\iint_{\mathbb{R}^2} G_0(x, y; \xi, \eta)\, \Delta u(\xi, \eta)\,d\xi\,d\eta.$$

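Invoking the definition of the delta function on the left-hand side and formally applying the Green identity (6.88) to the right-hand side produces

$$\iint_{\mathbb{R}^2} \delta(x - \xi)\,\delta(y - \eta)\, u(\xi, \eta)\,d\xi\,d\eta = \iint_{\mathbb{R}^2} -\Delta G_0(x, y; \xi, \eta)\, u(\xi, \eta)\,d\xi\,d\eta. \tag{6.108}$$

It is in this dual sense that we justify the desired formula

$$-\Delta G_0(\mathbf{x}; \boldsymbol{\xi}) = \frac{1}{2\pi}\, \Delta\bigl( \log \|\mathbf{x} - \boldsymbol{\xi}\| \bigr) = \delta(\mathbf{x} - \boldsymbol{\xi}). \tag{6.109}$$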


Proof of Theorem 6.17: We first note that, even though $G_0(\mathbf{x}; \boldsymbol{\xi})$ has a logarithmic singularity at $\mathbf{x} = \boldsymbol{\xi}$, the double integral in (6.107) is finite. Indeed, after introducing polar coordinates $\xi = x + r\cos\theta$, $\eta = y + r\sin\theta$, and recalling $d\xi\,d\eta = r\,dr\,d\theta$, we see that it equals

$$-\frac{1}{2\pi} \iint (r \log r)\, \Delta u\,dr\,d\theta.$$

The product $r \log r$ is everywhere continuous, even at $r = 0$, and so, provided $\Delta u$ is well behaved, e.g., continuous, the integral is finite. There is, of course, no problem with the line integral in (6.107), since the contour does not go through the singularity.

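Let us now avoid dealing directly with the singularity by working on a subdomain

$$\Omega_\varepsilon = \Omega \setminus D_\varepsilon(\mathbf{x}) = \{\, \boldsymbol{\xi} \in \Omega \mid \|\mathbf{x} - \boldsymbol{\xi}\| > \varepsilon \,\}$$

obtained by cutting out a small disk

$$D_\varepsilon(\mathbf{x}) = \{\, \boldsymbol{\xi} \mid \|\mathbf{x} - \boldsymbol{\xi}\| \le \varepsilon \,\}$$

of radius $\varepsilon > 0$ centered at $\mathbf{x}$. We choose $\varepsilon$ sufficiently small in order that $D_\varepsilon(\mathbf{x}) \subset \Omega$, and hence

$$\partial\Omega_\varepsilon = \partial\Omega \cup C_\varepsilon, \qquad \text{where} \qquad C_\varepsilon = \{\, \|\mathbf{x} - \boldsymbol{\xi}\| = \varepsilon \,\}$$

is the circular boundary of the disk. The subdomain $\Omega_\varepsilon$ is represented by the shaded region in Figure 6.12. Since the double integral is well defined, we can approximate it by integrating over $\Omega_\varepsilon$:

$$\iint_{\Omega} G_0(x, y; \xi, \eta)\, \Delta u(\xi, \eta)\,d\xi\,d\eta = \lim_{\varepsilon \to 0} \iint_{\Omega_\varepsilon} G_0(x, y; \xi, \eta)\, \Delta u(\xi, \eta)\,d\xi\,d\eta. \tag{6.110}$$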


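Since $G_0$ has no singularities in $\Omega_\varepsilon$, we are able to apply the Green formula (6.85) and then (6.103) to evaluate

$$\begin{aligned}
\iint_{\Omega_\varepsilon} G_0(x, y; \xi, \eta)\, \Delta u(\xi, \eta)\,d\xi\,d\eta = {} & \oint_{\partial\Omega} \left( G_0(x, y; \xi, \eta)\, \frac{\partial u}{\partial n}(\xi, \eta) - \frac{\partial G_0}{\partial n}(x, y; \xi, \eta)\, u(\xi, \eta) \right) ds \\
& - \oint_{C_\varepsilon} \left( G_0(x, y; \xi, \eta)\, \frac{\partial u}{\partial n}(\xi, \eta) - \frac{\partial G_0}{\partial n}(x, y; \xi, \eta)\, u(\xi, \eta) \right) ds,
\end{aligned} \tag{6.111}$$

where the line integral around $C_\varepsilon$ is taken in the usual counterclockwise direction, which is the opposite orientation to that induced by its status as part of the boundary of $\Omega_\varepsilon$. Now, on the circle $C_\varepsilon$,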


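$$G_0(x, y; \xi, \eta) = -\frac{\log r}{2\pi} \bigg|_{r = \varepsilon} = -\frac{\log \varepsilon}{2\pi}, \tag{6.112}$$

while, in view of Exercise 6.3.1,

$$\frac{\partial G_0}{\partial n}(x, y; \xi, \eta) = -\frac{1}{2\pi}\, \frac{\partial(\log r)}{\partial r} \bigg|_{r = \varepsilon} = -\frac{1}{2\pi\varepsilon}. \tag{6.113}$$

Therefore,

$$\oint_{C_\varepsilon} \frac{\partial G_0}{\partial n}(x, y; \xi, \eta)\, u(\xi, \eta)\,ds = -\frac{1}{2\pi\varepsilon} \oint_{C_\varepsilon} u(\xi, \eta)\,ds,$$

which we recognize as minus the average of $u$ on the circle of radius $\varepsilon$. As $\varepsilon \to 0$, the circles shrink down to their common center, and so, by continuity, the averages tend to the value $u(x, y)$ at the center; thus,

$$\lim_{\varepsilon \to 0} \oint_{C_\varepsilon} \frac{\partial G_0}{\partial n}(x, y; \xi, \eta)\, u(\xi, \eta)\,ds = -u(x, y). \tag{6.114}$$

On the other hand, using (6.112), and then (6.89) on the disk $D_\varepsilon$, we have

$$\oint_{C_\varepsilon} G_0(x, y; \xi, \eta)\, \frac{\partial u}{\partial n}(\xi, \eta)\,ds = -\frac{\log \varepsilon}{2\pi} \oint_{C_\varepsilon} \frac{\partial u}{\partial n}(\xi, \eta)\,ds = -\frac{\log \varepsilon}{2\pi} \iint_{D_\varepsilon} \Delta u(\xi, \eta)\,d\xi\,d\eta = -\tfrac{1}{2}\,(\varepsilon^2 \log \varepsilon)\, \Delta u_\varepsilon,$$

where

$$\Delta u_\varepsilon = \frac{1}{\pi \varepsilon^2} \iint_{D_\varepsilon} \Delta u(\xi, \eta)\,d\xi\,d\eta$$

is the average of $\Delta u$ over the disk $D_\varepsilon$. As above, as $\varepsilon \to 0$, the averages over the disks converge to the value at their common center, $\Delta u_\varepsilon \to \Delta u(x, y)$, and hence

$$\lim_{\varepsilon \to 0} \oint_{C_\varepsilon} G_0(x, y; \xi, \eta)\, \frac{\partial u}{\partial n}(\xi, \eta)\,ds = \lim_{\varepsilon \to 0} \left( -\tfrac{1}{2}\, \varepsilon^2 \log \varepsilon \right) \Delta u_\varepsilon = 0. \tag{6.115}$$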


In view of (6.110), (6.114), and (6.115), the $\varepsilon \to 0$ limit of (6.111) is exactly the Green representation formula (6.107). Q.E.D.

As noted above, the free space Green’s function (6.106) represents the gravitational potential in empty two-dimensional space due to a unit point mass, or, equivalently, the two-dimensional electrostatic potential due to a unit point charge sitting at position $\boldsymbol{\xi}$. The corresponding gravitational or electrostatic force field is obtained by taking its gradient:

$$\mathbf{F} = \nabla G_0 = -\frac{1}{2\pi}\, \frac{\mathbf{x} - \boldsymbol{\xi}}{\|\mathbf{x} - \boldsymbol{\xi}\|^2}.$$

Its magnitude

$$\|\mathbf{F}\| = \frac{1}{2\pi}\, \frac{1}{\|\mathbf{x} - \boldsymbol{\xi}\|}$$

is  inversely  proportional  to  the  distance  from  the  mass  or  charge,  which  is  the  two- 
dimensional  form  of  Newton’s  and  Coulomb’s  three-dimensional  inverse  square  laws. 
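As a quick check of the inverse first-power law, the following sketch compares a centered-difference gradient of the logarithmic potential (6.106) with the closed-form force field, and confirms that the magnitude times the distance is the constant $1/(2\pi)$. The function names and the step size `h` are assumptions chosen for illustration.

```python
import math

def G0(x, y, xi, eta):
    """Free-space Green's function (6.106) with the source at (xi, eta)."""
    return -math.log((x - xi) ** 2 + (y - eta) ** 2) / (4.0 * math.pi)

def force(x, y, xi, eta):
    """Closed-form gradient of G0 with respect to (x, y)."""
    dx, dy = x - xi, y - eta
    r2 = dx * dx + dy * dy
    return (-dx / (2.0 * math.pi * r2), -dy / (2.0 * math.pi * r2))

def force_numeric(x, y, xi, eta, h=1e-6):
    """Centered-difference approximation to the gradient of G0."""
    fx = (G0(x + h, y, xi, eta) - G0(x - h, y, xi, eta)) / (2.0 * h)
    fy = (G0(x, y + h, xi, eta) - G0(x, y - h, xi, eta)) / (2.0 * h)
    return (fx, fy)

if __name__ == "__main__":
    point, source = (1.0, 0.7), (0.3, -0.2)
    F = force(*point, *source)
    distance = math.hypot(point[0] - source[0], point[1] - source[1])
    # Magnitude times distance should equal 1 / (2 * pi).
    print(math.hypot(*F) * distance, 1.0 / (2.0 * math.pi))
    print(F, force_numeric(*point, *source))
```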




The gravitational potential due to a two-dimensional mass, e.g., a flat plate, in the shape of a domain $\Omega \subset \mathbb{R}^2$ is obtained by superimposing delta function sources with strengths equal to the density of the material at each point. The result is the potential function

$$u(x, y) = -\frac{1}{4\pi} \iint_{\Omega} \rho(\xi, \eta)\, \log \bigl[ (x - \xi)^2 + (y - \eta)^2 \bigr]\,d\xi\,d\eta, \tag{6.116}$$

in which $\rho(\xi, \eta)$ denotes the density at position $(\xi, \eta) \in \Omega$.

Example 6.18. The gravitational potential due to a circular disk $D = \{\, x^2 + y^2 \le 1 \,\}$ of unit radius and unit density $\rho \equiv 1$ is

$$u(x, y) = -\frac{1}{4\pi} \iint_{D} \log \bigl[ (x - \xi)^2 + (y - \eta)^2 \bigr]\,d\xi\,d\eta. \tag{6.117}$$

A direct evaluation of this double integral is not so easy. However, we can write down the potential in closed form by recalling that it solves the Poisson equation

$$-\Delta u = \begin{cases} 1, & \|\mathbf{x}\| < 1, \\ 0, & \|\mathbf{x}\| > 1. \end{cases} \tag{6.118}$$


Moreover, $u$ is clearly radially symmetric, and hence a function of $r$ alone. Thus, in the polar coordinate expression (4.105) for the Laplacian, the $\theta$ derivative terms vanish, and so (6.118) reduces to

$$\frac{d^2 u}{dr^2} + \frac{1}{r}\, \frac{du}{dr} = \begin{cases} -1, & r < 1, \\ 0, & r > 1, \end{cases}$$

which is effectively a first-order linear ordinary differential equation for $du/dr$. Solving separately on the two subintervals produces

$$u(r) = \begin{cases} a + b \log r - \tfrac{1}{4}\, r^2, & r < 1, \\ c + d \log r, & r > 1, \end{cases}$$


where $a, b, c, d$ are constants. Continuity of $u(r)$ and $u'(r)$ at $r = 1$ implies $c = a - \tfrac{1}{4}$, $d = b - \tfrac{1}{2}$. Moreover, the potential for a non-concentrated mass cannot have a singularity at the origin, and so $b = 0$. Direct evaluation of (6.117) at $x = y = 0$, using polar coordinates, proves that $a = \tfrac{1}{4}$. We conclude that the gravitational potential (6.117) due to a uniform disk of unit radius, and hence total mass (area) $\pi$, is, explicitly,

$$u(x, y) = \begin{cases} \tfrac{1}{4}(1 - r^2) = \tfrac{1}{4}(1 - x^2 - y^2), & x^2 + y^2 \le 1, \\ -\tfrac{1}{2} \log r = -\tfrac{1}{4} \log(x^2 + y^2), & x^2 + y^2 \ge 1. \end{cases} \tag{6.119}$$


Observe that, outside the disk, the potential is exactly the same as the logarithmic potential due to a point mass of magnitude $\pi$ located at the origin. Consequently, the gravitational force field outside a uniform disk is the same as if all its mass were concentrated at the origin.
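Although the double integral (6.117) is awkward to evaluate by hand, it is easy to approximate numerically and compare with the closed form (6.119). The sketch below is an illustration only: the midpoint rule in polar coordinates about the center of the disk, the grid sizes, and the function names are assumptions, and the logarithmic singularity at interior evaluation points limits the accuracy to a few digits.

```python
import math

def potential_quadrature(x, y, m_rho=300, m_phi=300):
    """Midpoint-rule approximation of (6.117), using polar coordinates
    (rho, phi) for the integration variables (xi, eta) in the unit disk."""
    h_rho = 1.0 / m_rho
    h_phi = 2.0 * math.pi / m_phi
    total = 0.0
    for i in range(m_rho):
        rho = (i + 0.5) * h_rho
        for j in range(m_phi):
            phi = (j + 0.5) * h_phi
            xi, eta = rho * math.cos(phi), rho * math.sin(phi)
            # The factor rho is the polar area element.
            total += math.log((x - xi) ** 2 + (y - eta) ** 2) * rho
    return -total * h_rho * h_phi / (4.0 * math.pi)

def potential_closed_form(x, y):
    """Closed-form potential (6.119) of the uniform unit disk."""
    r2 = x * x + y * y
    if r2 <= 1.0:
        return 0.25 * (1.0 - r2)
    return -0.25 * math.log(r2)

if __name__ == "__main__":
    for point in ((0.0, 0.0), (0.3, 0.4), (2.0, 0.0)):
        print(point, potential_quadrature(*point), potential_closed_form(*point))
```

The evaluation point $(2, 0)$ lies outside the disk, where the integrand is smooth and the agreement is correspondingly better than at the interior points.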


With the free-space logarithmic potential in hand, let us return to the question of finding the Green’s function for a boundary value problem on a bounded domain $\Omega \subset \mathbb{R}^2$. Since the logarithmic potential (6.106) is a particular solution to the Poisson equation (6.98), the general solution, according to Theorem 1.6, is given by $u = G_0 + z$, where $z$ is an arbitrary solution to the homogeneous equation $\Delta z = 0$, i.e., an arbitrary harmonic function. Thus, constructing the Green’s function has been reduced to the problem of finding the harmonic function $z$ such that $G = G_0 + z$ satisfies the desired homogeneous boundary conditions. Let us explicitly formulate this result for the (inhomogeneous) Dirichlet problem.

Theorem 6.19. The Green’s function for the Dirichlet boundary value problem for the Poisson equation on a bounded domain $\Omega \subset \mathbb{R}^2$ has the form

$$G(x, y; \xi, \eta) = G_0(x, y; \xi, \eta) + z(x, y; \xi, \eta), \tag{6.120}$$

where the first term is the logarithmic potential (6.106), while, for each $(\xi, \eta) \in \Omega$, the second term is the harmonic function that solves the boundary value problem

$$\Delta z = 0 \quad \text{on } \Omega, \qquad z(x, y; \xi, \eta) = \frac{1}{4\pi} \log \bigl[ (x - \xi)^2 + (y - \eta)^2 \bigr] \quad \text{for } (x, y) \in \partial\Omega. \tag{6.121}$$

If $u(x, y)$ is a solution to the inhomogeneous Dirichlet problem

$$-\Delta u = f, \quad \mathbf{x} \in \Omega, \qquad u = h, \quad \mathbf{x} \in \partial\Omega, \tag{6.122}$$

then

$$u(x, y) = \iint_{\Omega} G(x, y; \xi, \eta)\, f(\xi, \eta)\,d\xi\,d\eta - \oint_{\partial\Omega} \frac{\partial G}{\partial n}(x, y; \xi, \eta)\, h(\xi, \eta)\,ds, \tag{6.123}$$

where the normal derivative of $G$ is taken with respect to $(\xi, \eta) \in \partial\Omega$.

Proof: To show that (6.120) is the Green’s function, we note that

$$-\Delta G = -\Delta G_0 - \Delta z = \delta_{(\xi,\eta)} \quad \text{in } \Omega, \tag{6.124}$$

while

$$G(x, y; \xi, \eta) = G_0(x, y; \xi, \eta) + z(x, y; \xi, \eta) = 0 \quad \text{on } \partial\Omega. \tag{6.125}$$

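Next, to establish the solution formula (6.123), since both $z$ and $u$ are $C^2$, we can use (6.88) (with $v = z$, keeping in mind that $\Delta z = 0$) to establish

$$0 = -\iint_{\Omega} z(x, y; \xi, \eta)\, \Delta u(\xi, \eta)\,d\xi\,d\eta + \oint_{\partial\Omega} \left( z(x, y; \xi, \eta)\, \frac{\partial u}{\partial n}(\xi, \eta) - \frac{\partial z}{\partial n}(x, y; \xi, \eta)\, u(\xi, \eta) \right) ds.$$

Adding this to Green’s representation formula (6.107), and using (6.125), we deduce that

$$u(x, y) = -\iint_{\Omega} G(x, y; \xi, \eta)\, \Delta u(\xi, \eta)\,d\xi\,d\eta - \oint_{\partial\Omega} \frac{\partial G(x, y; \xi, \eta)}{\partial n}\, u(\xi, \eta)\,ds,$$

which, given (6.122), produces (6.123). Q.E.D.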


The one subtle issue left unresolved is the existence of the solution. Read properly, Theorem 6.19 states that if a classical solution exists, then it is necessarily given by the Green’s function formula (6.123). Proving existence of the solution, and also the existence of the Green’s function or, equivalently, of the solution $z$ to (6.121), requires further in-depth analysis, lying beyond the scope of this text. In particular, to guarantee existence, the underlying domain must have a reasonably nice boundary, e.g., a piecewise smooth curve without sharp cusps. Interestingly, lack of regularity at sharp cusps in the boundary underlies the electromagnetic phenomenon known as St. Elmo’s fire, cf. [121]. The extension to irregular domains, e.g., those with fractal boundaries, is an active area of contemporary research. Moreover, unlike one-dimensional boundary value problems, mere continuity of the forcing function $f$ is not quite sufficient to ensure the existence of a classical solution to the Poisson boundary value problem; differentiability does suffice, although this assumption can be weakened. We refer to [61, 70] for a development of the Perron method based on approximating the solution by a sequence of subsolutions, which, by definition, solve the differential inequality $-\Delta u \le f$. An alternative proof, using the direct method of the calculus of variations, can be found in [35]. The latter proof relies on the characterization of the solution by a minimization principle, which we discuss in some detail in Chapter 9.


Exercises 


0 6.3.1. Let $C_R$ be a circle of radius $R$ centered at the origin and $\mathbf{n}$ its unit outward normal. Let $f(r, \theta)$ be a function expressed in polar coordinates. Prove that $\partial f/\partial n = \partial f/\partial r$ on $C_R$.

6.3.2. Let $f(x) > 0$ be a continuous, positive function on the interval $a \le x \le b$. Let $\Omega$ be the domain lying between the graph of $f(x)$ on the interval $[a, b]$ and the $x$-axis. Explain why (6.77) reduces to the usual calculus formula for the area under the graph of $f$.

6.3.3. Explain what happens to the conclusion of Lemma 6.16 if $\Omega$ is not a connected domain.

6.3.4. Can you find constants $c_n$ such that the functions $g_n(x, y) = c_n \bigl[ 1 + n^2 (x^2 + y^2) \bigr]^{-1}$ converge to the two-dimensional delta function: $g_n(x, y) \to \delta(x, y)$ as $n \to \infty$?

0 6.3.5. Explain why the two-dimensional delta function satisfies the scaling law

$$\delta(\beta x, \beta y) = \frac{1}{\beta^2}\, \delta(x, y), \qquad \text{for } \beta \ne 0.$$

0 6.3.6. Write out a polar coordinate formula, in terms of $\delta(r - r_0)$ and $\delta(\theta - \theta_0)$, for the two-dimensional delta function $\delta(x - x_0,\, y - y_0) = \delta(x - x_0)\,\delta(y - y_0)$.

6.3.7. True or false: $\delta(\mathbf{x}) = \delta(\|\mathbf{x}\|)$.

0 6.3.8. Suppose that $\xi = f(x, y)$, $\eta = g(x, y)$ defines a one-to-one $C^1$ map from a domain $D \subset \mathbb{R}^2$ to the domain $\Omega = \{\, (\xi, \eta) = (f(x, y), g(x, y)) \mid (x, y) \in D \,\} \subset \mathbb{R}^2$, and has nonzero Jacobian determinant: $J(x, y) = f_x g_y - f_y g_x \ne 0$ for all $(x, y) \in D$. Suppose further that $(0, 0) = (f(x_0, y_0), g(x_0, y_0)) \in \Omega$ for $(x_0, y_0) \in D$. Prove the following formula governing the effect of the map on the two-dimensional delta function:

$$\delta\bigl( f(x, y),\, g(x, y) \bigr) = \frac{\delta(x - x_0,\, y - y_0)}{|\,J(x_0, y_0)\,|}. \tag{6.126}$$


6.3.9. Suppose $f(x, y) = \begin{cases} 1, & 3x - 2y > 1, \\ 0, & 3x - 2y < 1. \end{cases}$ Compute its partial derivatives $\partial f/\partial x$ and $\partial f/\partial y$ in the sense of generalized functions.


6.3.10. Find a series solution to the rectangular boundary value problem (4.91-92) when the boundary data $f(x) = \delta(x - \xi)$ is a delta function at a point $0 < \xi < a$. Is your solution infinitely differentiable inside the rectangle?

6.3.11. Answer Exercise 6.3.10 when $f(x) = \delta'(x - \xi)$ is the derivative of the delta function.

6.3.12. A 1 meter square plate is subject to the Neumann boundary conditions $\partial u/\partial n = 1$ on its entire boundary. What is the equilibrium temperature? Explain.


6.3.13. A conservation law for an equilibrium system in two dimensions is, by definition, a divergence expression

$$\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} = 0 \tag{6.127}$$

that vanishes for all solutions.

(a) Given a conservation law prescribed by $\mathbf{v} = (X, Y)$ defined on a simply connected domain $D$, show that the line integral $\int_C \mathbf{v} \cdot \mathbf{n}\,ds = \int_C X\,dy - Y\,dx$ is path-independent, meaning that its value depends only on the endpoints of the curve $C$.

(b) Show that the Laplace equation can be written as a conservation law, and write down the corresponding path-independent line integral.

Note: Path-independent integrals are of importance in the study of cracks, dislocations, and other material singularities, [49].


0 6.3.14. In two-dimensional dynamics, a conservation law is an equation of the form

$$\frac{\partial T}{\partial t} + \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} = 0, \tag{6.128}$$

in which $T$ is the conserved density, while $\mathbf{v} = (X, Y)$ represents the associated flux.

(a) Prove that, on a bounded domain $\Omega \subset \mathbb{R}^2$, the rate of change of the integral $\iint_{\Omega} T\,dx\,dy$ of the conserved density depends only on the flux through the boundary $\partial\Omega$.

(b) Write the partial differential equation $u_t + u\,u_x + u\,u_y = 0$ as a conservation law. What is the integrated version?


The  Method  of  Images 

The  preceding  analysis  exposes  the  underlying  form  of  the  Green’s  function,  but  we  are 
still  left  with  the  determination  of  the  harmonic  component  z(x,y)  required  to  match  the 
logarithmic  potential  boundary  values,  cf.  (6.121).  We  will  discuss  two  principal  analytic 
techniques  employed  to  produce  explicit  formulas.  The  first  is  an  adaptation  of  the  method 
of  separation  of  variables,  which  leads  to  infinite  series  expressions.  We  will  not  dwell  on 
this  approach  here,  although  a  couple  of  the  exercises  ask  the  reader  to  work  through  some 
of the details; see also the discussion leading up to (9.110). The second is the Method of Images, which will be developed in this section. Another approach is based on the theory of conformal mapping; it can be found in books on complex analysis, including [53, 98]. While the first two methods are limited to a fairly small class of domains, they
extend  to  higher-dimensional  problems,  as  well  as  to  certain  other  types  of  elliptic  boundary 
value  problems,  whereas  conformal  mapping  is,  unfortunately,  restricted  to  two-dimensional 
problems  involving  the  Laplace  and  Poisson  equations. 

We  already  know  that  the  singular  part  of  the  Green’s  function  for  the  two-dimensional 
Poisson  equation  is  provided  by  a  logarithmic  potential.  The  problem,  then,  is  to  construct 
the  harmonic  part,  called  z(x,y)  in  (6.120),  so  that  the  sum  has  the  correct  homogeneous 
boundary  values,  or,  equivalently,  so  that  z(x,y)  has  the  same  boundary  values  as  the 
logarithmic potential. In certain cases, $z(x, y)$ can be thought of as the potential induced by one or more hypothetical electric charges (or, equivalently, gravitational point masses) that are located outside the domain $\Omega$, arranged in such a manner that their combined
electrostatic  potential  happens  to  coincide  with  the  logarithmic  potential  on  the  boundary 
of  the  domain.  The  goal,  then,  is  to  place  image  charges  of  suitable  strengths  in  the 
appropriate  positions. 

Here, we will only consider the case of a single image charge, located at a position $\boldsymbol{\eta} \notin \Omega$. We scale the logarithmic potential (6.106) by the charge strength, and, for added flexibility, include an additional constant, the charge’s potential baseline:

$$z(x, y) = a \log \|\mathbf{x} - \boldsymbol{\eta}\| + b, \qquad \boldsymbol{\eta} \in \mathbb{R}^2 \setminus \overline{\Omega}.$$

The function $z(x, y)$ is harmonic inside $\Omega$, since the logarithmic potential is harmonic everywhere except at the external singularity $\boldsymbol{\eta}$. For the Dirichlet boundary value problem, then, for each point $\boldsymbol{\xi} \in \Omega$, we must find a corresponding image point $\boldsymbol{\eta} \in \mathbb{R}^2 \setminus \overline{\Omega}$ and constants $a, b \in \mathbb{R}$ such that†

$$\log \|\mathbf{x} - \boldsymbol{\xi}\| = a \log \|\mathbf{x} - \boldsymbol{\eta}\| + b \qquad \text{for all } \mathbf{x} \in \partial\Omega,$$

or, equivalently,

$$\|\mathbf{x} - \boldsymbol{\xi}\| = \lambda\, \|\mathbf{x} - \boldsymbol{\eta}\|^{a} \qquad \text{for all } \mathbf{x} \in \partial\Omega, \tag{6.129}$$

where $\lambda = e^{b}$. For each fixed $\boldsymbol{\xi}, \boldsymbol{\eta}, \lambda, a$, the equation in (6.129) will, typically, implicitly prescribe a plane curve, but it is not clear that one can always arrange that these curves all coincide with the boundary of our domain.

Figure 6.13. Method of Images for the unit disk.

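To make further progress, we appeal to a geometric construction based on similar triangles. Let us select $\boldsymbol{\eta} = c\,\boldsymbol{\xi}$ to be a point lying on the ray through $\boldsymbol{\xi}$. Its location is chosen so that the triangle with vertices $\mathbf{0}, \mathbf{x}, \boldsymbol{\eta}$ is similar to the triangle with vertices $\mathbf{0}, \boldsymbol{\xi}, \mathbf{x}$, noting that they have the same angle at the common vertex $\mathbf{0}$; see Figure 6.13. Similarity requires that the triangles’ corresponding sides have a common ratio, and so

$$\frac{\|\boldsymbol{\xi}\|}{\|\mathbf{x}\|} = \frac{\|\mathbf{x}\|}{\|\boldsymbol{\eta}\|} = \frac{\|\mathbf{x} - \boldsymbol{\xi}\|}{\|\mathbf{x} - \boldsymbol{\eta}\|} = \lambda. \tag{6.130}$$

The last equality implies that (6.129) holds with $a = 1$. Consequently, if we choose

$$\|\boldsymbol{\eta}\| = \frac{1}{\|\boldsymbol{\xi}\|}, \qquad \text{so that} \qquad \boldsymbol{\eta} = \frac{\boldsymbol{\xi}}{\|\boldsymbol{\xi}\|^2}, \tag{6.131}$$

then $\|\mathbf{x}\|^2 = \|\boldsymbol{\xi}\|\, \|\boldsymbol{\eta}\| = 1$.

† To simplify the formulas, we have omitted the $1/(2\pi)$ factor, which can easily be reinstated at the end of the analysis.

Figure 6.14. Green’s function for the unit disk.

Thus $\mathbf{x}$ lies on the unit circle, and, as a result, $\lambda = \|\boldsymbol{\xi}\| = 1/\|\boldsymbol{\eta}\|$. The map taking a point $\boldsymbol{\xi}$ inside the disk to its image point $\boldsymbol{\eta}$ defined by (6.131) is known as inversion with respect to the unit circle.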
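The key geometric claim, namely that the ratio of distances in (6.130) is the same constant $\|\boldsymbol{\xi}\|$ at every point of the unit circle, is easy to test numerically. The following sketch assumes points are stored as coordinate pairs; the names `invert`, `dist`, and `boundary_ratio` are introduced only for this illustration.

```python
import math

def invert(xi):
    """Image point eta = xi / ||xi||^2 from (6.131); xi must be nonzero."""
    norm2 = xi[0] ** 2 + xi[1] ** 2
    return (xi[0] / norm2, xi[1] / norm2)

def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def boundary_ratio(xi, theta):
    """Ratio ||x - xi|| / ||x - eta|| at the boundary point x = (cos theta, sin theta)."""
    x = (math.cos(theta), math.sin(theta))
    return dist(x, xi) / dist(x, invert(xi))

if __name__ == "__main__":
    xi = (0.3, 0.4)  # an interior point with ||xi|| = 0.5
    for k in range(6):
        # Every ratio should equal ||xi|| = 0.5, independently of theta.
        print(boundary_ratio(xi, k * math.pi / 3.0))
```

Since the ratio is constant on the circle, the two logarithmic potentials differ there only by the constant $\log \|\boldsymbol{\xi}\|$, which is what makes a single image charge sufficient for the disk.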

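We have now demonstrated that the potentials

$$\frac{1}{2\pi} \log \|\mathbf{x} - \boldsymbol{\xi}\| = \frac{1}{2\pi} \log \bigl( \|\boldsymbol{\xi}\|\, \|\mathbf{x} - \boldsymbol{\eta}\| \bigr) = \frac{1}{2\pi} \log \frac{\bigl\|\, \|\boldsymbol{\xi}\|^2\, \mathbf{x} - \boldsymbol{\xi} \,\bigr\|}{\|\boldsymbol{\xi}\|}, \qquad \|\mathbf{x}\| = 1, \tag{6.132}$$

have the same boundary values on the unit circle. Consequently, their difference

$$G(\mathbf{x}; \boldsymbol{\xi}) = -\frac{1}{2\pi} \log \|\mathbf{x} - \boldsymbol{\xi}\| + \frac{1}{2\pi} \log \frac{\bigl\|\, \|\boldsymbol{\xi}\|^2\, \mathbf{x} - \boldsymbol{\xi} \,\bigr\|}{\|\boldsymbol{\xi}\|} = \frac{1}{2\pi} \log \frac{\bigl\|\, \|\boldsymbol{\xi}\|^2\, \mathbf{x} - \boldsymbol{\xi} \,\bigr\|}{\|\boldsymbol{\xi}\|\, \|\mathbf{x} - \boldsymbol{\xi}\|} \tag{6.133}$$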


has the required properties for the Green’s function for the Dirichlet problem on the unit disk. Writing this in terms of polar coordinates

$$\mathbf{x} = (r \cos\theta,\ r \sin\theta), \qquad \boldsymbol{\xi} = (\rho \cos\phi,\ \rho \sin\phi),$$

and applying the Law of Cosines to the triangles in Figure 6.13 produces the explicit formula

$$G(r, \theta; \rho, \phi) = \frac{1}{4\pi} \log \left( \frac{1 + r^2 \rho^2 - 2 r \rho \cos(\theta - \phi)}{r^2 + \rho^2 - 2 r \rho \cos(\theta - \phi)} \right). \tag{6.134}$$
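The polar formula (6.134) can be compared directly with the vector form (6.133), and its defining properties (vanishing on the unit circle, symmetry under interchange of the two points) checked at sample points. This is a small illustrative sketch; the function names and the sample points are assumptions, and `green_disk_vector` requires a nonzero impulse point.

```python
import math

def green_disk_polar(r, theta, rho, phi):
    """Green's function of the unit disk in polar coordinates, formula (6.134)."""
    c = math.cos(theta - phi)
    numerator = 1.0 + r * r * rho * rho - 2.0 * r * rho * c
    denominator = r * r + rho * rho - 2.0 * r * rho * c
    return math.log(numerator / denominator) / (4.0 * math.pi)

def green_disk_vector(x, xi):
    """Green's function of the unit disk in the form (6.133); xi must be nonzero."""
    s2 = xi[0] ** 2 + xi[1] ** 2
    top = math.hypot(s2 * x[0] - xi[0], s2 * x[1] - xi[1])
    bottom = math.sqrt(s2) * math.hypot(x[0] - xi[0], x[1] - xi[1])
    return math.log(top / bottom) / (2.0 * math.pi)

def polar_point(r, theta):
    """Cartesian coordinates of the point with polar coordinates (r, theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

if __name__ == "__main__":
    r, theta, rho, phi = 0.7, 1.1, 0.5, -0.4
    print(green_disk_polar(r, theta, rho, phi),
          green_disk_vector(polar_point(r, theta), polar_point(rho, phi)))
    print(green_disk_polar(1.0, theta, rho, phi))  # boundary value, should vanish
```

The numerator and denominator in (6.134) differ by $(1 - r^2)(1 - \rho^2)$, which shows at a glance that the Green’s function is positive inside the disk and zero on its boundary.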


In Figure 6.14 we sketch the Green’s function for the Dirichlet boundary value problem corresponding to a unit impulse being applied at a point halfway between the center and the edge of the disk. We also require its radial derivative

$$\frac{\partial G}{\partial \rho}(r, \theta; 1, \phi) = -\frac{1}{2\pi}\, \frac{1 - r^2}{1 + r^2 - 2 r \cos(\theta - \phi)}, \tag{6.135}$$

which coincides with its normal derivative on the unit circle. Thus, specializing (6.123), we arrive at a solution to the general Dirichlet boundary value problem for the Poisson equation in the unit disk.

Figure 6.15. The Poisson kernel.


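Theorem 6.20. The solution to the inhomogeneous Dirichlet boundary value problem

$$-\Delta u = f, \quad \text{for } r = \|\mathbf{x}\| < 1, \qquad u = h, \quad \text{for } r = 1,$$

is, when expressed in polar coordinates,

$$\begin{aligned}
u(r, \theta) = {} & \frac{1}{4\pi} \int_{-\pi}^{\pi} \int_0^1 f(\rho, \phi)\, \log \left( \frac{1 + r^2 \rho^2 - 2 r \rho \cos(\theta - \phi)}{r^2 + \rho^2 - 2 r \rho \cos(\theta - \phi)} \right) \rho\,d\rho\,d\phi \\
& + \frac{1}{2\pi} \int_{-\pi}^{\pi} h(\phi)\, \frac{1 - r^2}{1 + r^2 - 2 r \cos(\theta - \phi)}\,d\phi.
\end{aligned} \tag{6.136}$$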
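To see the first term of (6.136) in action, consider the simplest case of constant forcing and zero boundary data. For $f \equiv 1$ and $h \equiv 0$ the solution is known in closed form, $u = \tfrac{1}{4}(1 - r^2)$, which one can confirm by substituting into $-\Delta u = 1$. The sketch below approximates the double integral by the midpoint rule; the function name `volume_term`, the grid sizes, and the choice of quadrature are assumptions, and the logarithmic singularity of the integrand limits the accuracy to a few digits.

```python
import math

def volume_term(f, r, theta, m_rho=300, m_phi=300):
    """Midpoint-rule approximation of the double integral in (6.136) for a
    forcing function f(rho, phi) given in polar coordinates."""
    h_rho = 1.0 / m_rho
    h_phi = 2.0 * math.pi / m_phi
    total = 0.0
    for i in range(m_rho):
        rho = (i + 0.5) * h_rho
        for j in range(m_phi):
            phi = -math.pi + (j + 0.5) * h_phi
            c = math.cos(theta - phi)
            numerator = 1.0 + r * r * rho * rho - 2.0 * r * rho * c
            denominator = r * r + rho * rho - 2.0 * r * rho * c
            total += f(rho, phi) * math.log(numerator / denominator) * rho
    return total * h_rho * h_phi / (4.0 * math.pi)

def unit_forcing(rho, phi):
    return 1.0

if __name__ == "__main__":
    for r in (0.0, 0.5, 0.9):
        # Numerical value of (6.136) next to the exact solution (1 - r^2) / 4.
        print(r, volume_term(unit_forcing, r, 0.3), 0.25 * (1.0 - r * r))
```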


When $f \equiv 0$, formula (6.136) recovers the Poisson integral formula (4.126) for the solution to the Dirichlet boundary value problem for the Laplace equation. In particular, the boundary data $h(\theta) = \delta(\theta - \phi)$, corresponding to a concentrated unit heat source applied to a single point on the boundary, produces the Poisson kernel

$$u(r, \theta) = \frac{1 - r^2}{2\pi \bigl( 1 + r^2 - 2 r \cos(\theta - \phi) \bigr)}. \tag{6.137}$$

The reader may enjoy verifying that this function indeed solves the Laplace equation and has the correct boundary values in the limit as $r \to 1$.
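A numerical experiment along these lines is sketched below: integrating boundary data against the Poisson kernel (6.137) should reproduce the harmonic function with those boundary values. The names `poisson_kernel` and `poisson_integral`, the midpoint rule, and the number of quadrature nodes are assumptions for this illustration. The boundary data $\cos\phi$ and $\sin 2\phi$ are chosen because their harmonic extensions, $r\cos\theta$ and $r^2 \sin 2\theta$, are known exactly.

```python
import math

def poisson_kernel(r, theta, phi):
    """Poisson kernel (6.137) for the unit disk."""
    return (1.0 - r * r) / (2.0 * math.pi * (1.0 + r * r - 2.0 * r * math.cos(theta - phi)))

def poisson_integral(h, r, theta, m=2000):
    """Midpoint-rule approximation of the boundary term of (6.136): the
    integral of h(phi) against the Poisson kernel over -pi < phi <= pi."""
    step = 2.0 * math.pi / m
    total = 0.0
    for k in range(m):
        phi = -math.pi + (k + 0.5) * step
        total += h(phi) * poisson_kernel(r, theta, phi)
    return total * step

if __name__ == "__main__":
    r, theta = 0.6, 0.8
    print(poisson_integral(lambda phi: 1.0, r, theta))       # constant data
    print(poisson_integral(math.cos, r, theta), r * math.cos(theta))
    print(poisson_integral(lambda phi: math.sin(2.0 * phi), r, theta),
          r * r * math.sin(2.0 * theta))
```

For smooth periodic integrands the midpoint rule converges very rapidly, so a modest number of nodes suffices unless $r$ is extremely close to 1, where the kernel becomes sharply peaked.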


Exercises 

6.3.15. A circular disk of radius 1 is subject to a heat source of unit magnitude on the subdisk $r < \tfrac{1}{2}$. Its boundary is kept at 0°.

(a) Write down an integral formula for the equilibrium temperature.

(b) Use radial symmetry to find an explicit formula for the equilibrium temperature.

6.3.16. A circular disk of radius 1 meter is subject to a unit concentrated heat source at its center and has completely insulated boundary. What is the equilibrium temperature?

G 6.3.17. (a) For $n > 0$, find the solution to the boundary value problem

$$-\Delta u = \frac{n}{\pi}\, e^{-n(x^2 + y^2)}, \quad x^2 + y^2 < 1, \qquad u(x, y) = 0, \quad x^2 + y^2 = 1.$$

(b) Discuss what happens in the limit as $n \to \infty$.

G 6.3.18. (a) Use the Method of Images to construct the Green’s function for a half-plane $\{\, y > 0 \,\}$ that is subject to homogeneous Dirichlet boundary conditions. Hint: The image point is obtained by reflection. (b) Use your Green’s function to solve the boundary value problem

$$-\Delta u = \frac{1}{1 + y}, \quad y > 0, \qquad u(x, 0) = 0.$$

6.3.19. Construct the Green’s function for the half-disk $D = \{\, x^2 + y^2 < 1,\ y > 0 \,\}$ when subject to homogeneous Dirichlet boundary conditions. Hint: Use three image points.

6.3.20.  Prove  directly  that  the  Poisson  kernel  (6.137)  solves  the  Laplace  equation  for  all  r  <  1. 

C  6.3.21.  Provide  the  details  for  the  following  alternative  method  for  solving  the  homogeneous 
Dirichlet  boundary  value  problem  for  the  Poisson  equation  on  the  unit  square: 

\[ -u_{xx} - u_{yy} = f(x,y), \quad u(x,0) = 0, \quad u(x,1) = 0, \quad u(0,y) = 0, \quad u(1,y) = 0, \quad 0 < x, y < 1. \]

(a) Write both $u(x,y)$ and $f(x,y)$ as Fourier sine series in $y$ whose coefficients depend on $x$.

(b) Substitute these series into the differential equation, and equate Fourier coefficients to
obtain an infinite system of ordinary boundary value problems for the $x$-dependent Fourier
coefficients of $u$.

(c) Use the Green's functions for each boundary value problem to write
out the solution and hence a series for the solution to the original boundary value problem.

(d) Implement this method for the following forcing functions:

(i) $f(x,y) = \sin \pi y$, (ii) $f(x,y) = \sin \pi x \, \sin 2\pi y$, (iii) $f(x,y) = 1$.

6.3.22.  Use  the  method  of  Exercise  6.3.21  to  find  a  series  representation  for  the  Green’s  function 
of  a  unit  square  subject  to  Dirichlet  boundary  conditions. 

6.3.23.  Write  out  the  details  of  how  to  derive  (6.134)  from  (6.133). 

6.3.24.  True  or  false:  If  the  gravitational  potential  at  a  point  a  is  greater  than  its  value  at  the 
point  b,  then  the  magnitude  of  the  gravitational  force  at  a  is  greater  than  its  value  at  b. 

6.3.25. (a) Write down integral formulas for the gravitational potential and force due to a square
plate $S = \{-1 \le x, y \le 1\}$ of unit density $\rho = 1$. (b) Use numerical integration to calculate
the gravitational force at the points $(2, 0)$ and $(\sqrt{2}, \sqrt{2})$. Before starting, try to predict
which point experiences the stronger force, and then check your prediction.

6.3.26.  An  equilateral  triangular  plate  with  unit  area  exerts  a  gravitational  force  on  an 
observer  sitting  a  unit  distance  away  from  its  center.  Is  the  force  greater  if  the  observer  is 
located  opposite  a  vertex  of  the  triangle  or  opposite  a  side?  Is  the  force  greater  than  or  less 
than  that  exerted  by  a  circular  plate  of  the  same  area?  Use  numerical  integration  to  evaluate 
the  double  integrals. 

6.3.27. Consider the wave equation $u_{tt} = c^2 u_{xx}$ on the line $-\infty < x < \infty$. Use the d'Alembert
formula (2.82) to solve the initial value problem $u(0,x) = \delta(x - a)$, $u_t(0,x) = 0$. Can you
realize your solution as the limit of classical solutions?

0 6.3.28. Consider the wave equation $u_{tt} = c^2 u_{xx}$ on the line $-\infty < x < \infty$. Use the d'Alembert
formula (2.82) to solve the initial value problem $u(0,x) = 0$, $u_t(0,x) = \delta(x - a)$, modeling
the effect of striking the string with a highly concentrated blow at the point $x = a$. Graph
the solution at several times. Discuss the behavior of any discontinuities in the solution. In
particular, show that $u(t,x) \ne 0$ on the domain of influence of the point $(0, a)$.

6.3.29. (a) Write down the solution $u(t,x)$ to the wave equation $u_{tt} = 4u_{xx}$ on the real line
with initial data

\[
u(0,x) = \begin{cases} 1 - |x|, & |x| < 1, \\ 0, & \text{otherwise}, \end{cases}
\qquad \frac{\partial u}{\partial t}(0,x) = 0.
\]

(b) Explain why $u(t,x)$ is not a classical solution to the wave equation. (c) Determine the
derivatives $\partial^2 u/\partial t^2$ and $\partial^2 u/\partial x^2$ in the sense of distributions (generalized functions) and use
this to justify the fact that $u(t,x)$ solves the wave equation in a distributional sense.

T 6.3.30. A piano string of length $\ell = 3$ and wave speed $c = 2$ with both ends fixed is hit by a
hammer $\tfrac13$ of the way along. The initial-boundary value problem that governs the resulting
vibrations of the string is

\[
\frac{\partial^2 u}{\partial t^2} = 4\,\frac{\partial^2 u}{\partial x^2}, \qquad
u(t,0) = 0 = u(t,3), \qquad u(0,x) = 0, \qquad \frac{\partial u}{\partial t}(0,x) = \delta(x - 1).
\]

(a)  What  are  the  fundamental  frequencies  of  vibration? 

(b)  Write  down  the  solution  to  the  initial-boundary  value  problem  in  Fourier  series  form. 

(c)  Write  down  the  Fourier  series  for  the  velocity  du/dt  of  your  solution. 

(d)  Write  down  the  d’Alembert  formula  for  the  solution,  and  sketch  a  picture  of  the  string 
at  four  or  five  representative  times. 

(e)  True  or  false:  The  solution  is  periodic  in  time.  If  true,  what  is  the  period?  If  false, 
explain  what  happens  as  t  increases. 

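6.3.31. (a) Write down a Fourier series for the solution to the initial-boundary value problem

\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \qquad
u(t,-1) = 0 = u(t,1), \qquad u(0,x) = \delta(x), \qquad \frac{\partial u}{\partial t}(0,x) = 0.
\]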

(b)  Write  down  an  analytic  formula  for  the  solution,  i.e.,  sum  your  series. 

(c)  In  what  sense  does  the  series  solution  in  part  (a)  converge  to  the  true  solution?  Do  the 
partial  sums  provide  a  good  approximation  to  the  actual  solution? 


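6.3.32. Answer Exercise 6.3.31 for

\[
\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \qquad
u(t,-1) = 0 = u(t,1), \qquad u(0,x) = 0, \qquad \frac{\partial u}{\partial t}(0,x) = \delta(x).
\]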


Chapter  7 

Fourier  Transforms 


Fourier  series  and  their  ilk  are  designed  to  solve  boundary  value  problems  on  bounded 
intervals.  The  extension  of  the  Fourier  calculus  to  the  entire  real  line  leads  naturally  to  the 
Fourier transform, a powerful mathematical tool for the analysis of aperiodic functions.
The Fourier transform is of fundamental importance in a remarkably broad range of
applications, including both ordinary and partial differential equations, probability, quantum
mechanics, signal and image processing, and control theory, to name but a few.

In  this  chapter,  we  motivate  the  construction  by  investigating  how  (rescaled)  Fourier 
series  behave  as  the  length  of  the  interval  goes  to  infinity.  The  resulting  Fourier  transform 
maps  a  function  defined  on  physical  space  to  a  function  defined  on  the  space  of  frequencies, 
whose  values  quantify  the  “amount”  of  each  periodic  frequency  contained  in  the  original 
function.  The  inverse  Fourier  transform  then  reconstructs  the  original  function  from  its 
transformed  frequency  components.  The  integrals  defining  the  Fourier  transform  and  its 
inverse  are,  remarkably,  almost  identical,  and  this  symmetry  is  often  exploited,  for  example 
in  assembling  tables  of  Fourier  transforms. 

One  of  the  most  important  properties  of  the  Fourier  transform  is  that  it  converts 
calculus  —  differentiation  and  integration  —  into  algebra  —  multiplication  and  division. 
This  underlies  its  application  to  linear  ordinary  differential  equations  and,  in  the  following 
chapters,  partial  differential  equations.  In  engineering  applications,  the  Fourier  transform 
is  sometimes  overshadowed  by  the  Laplace  transform,  which  is  a  particular  subcase.  The 
Fourier  transform  is  used  to  analyze  boundary  value  problems  on  the  entire  line.  The 
Laplace  transform  is  better  suited  to  solving  initial  value  problems,  [23],  but  will  not  be 
developed  in  this  text. 

The  Fourier  transform  is,  like  Fourier  series,  completely  compatible  with  the  calculus 
of  generalized  functions,  [68].  The  final  section  contains  a  brief  introduction  to  the  analytic 
foundations  of  the  subject,  including  the  basics  of  Hilbert  space.  However,  a  full,  rigorous 
development  requires  more  powerful  analytical  tools,  including  the  Lebesgue  integral  and 
complex  analysis,  and  the  interested  reader  is  therefore  referred  to  more  advanced  texts, 
including [37, 68, 98, 117].


7.1  The  Fourier  Transform 

We  begin  by  motivating  the  Fourier  transform  as  a  limiting  case  of  Fourier  series.  Although 
the  rigorous  details  are  subtle,  the  underlying  idea  can  be  straightforwardly  explained.  Let 
$f(x)$ be a function defined for all $-\infty < x < \infty$. The goal is to construct a Fourier
expansion for $f(x)$ in terms of basic trigonometric functions. One evident approach is to construct
its  Fourier  series  on  progressively  longer  and  longer  intervals,  and  then  take  the  limit  as 
their  lengths  go  to  infinity.  This  limiting  process  converts  the  Fourier  sums  into  integrals, 
and  the  resulting  representation  of  a  function  is  renamed  the  Fourier  transform.  Since  we 
are  dealing  with  an  infinite  interval,  there  are  no  longer  any  periodicity  requirements  on 
the  function  f(x).  Moreover,  the  frequencies  represented  in  the  Fourier  transform  are  no 
longer  constrained  by  the  length  of  the  interval,  and  so  we  are  effectively  decomposing  a 
quite  general  aperiodic  function  into  a  continuous  superposition  of  trigonometric  functions 
of  all  possible  frequencies. 

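Let us present the details in a more concrete form. The computations will be significantly
simpler if we work with the complex version of the Fourier series from the outset.
Our starting point is the rescaled Fourier series (3.86) on a symmetric interval $[-\ell, \ell]$ of
length $2\ell$, which we rewrite in the adapted form

\[
f(x) \sim \sum_{\nu = -\infty}^{\infty} \sqrt{\frac{\pi}{2}}\; \frac{\hat{f}_\ell(k_\nu)}{\ell}\; e^{\,i k_\nu x}.
\tag{7.1}
\]

The sum is over the discrete collection of frequencies

\[
k_\nu = \frac{\pi \nu}{\ell}, \qquad \nu = 0, \pm 1, \pm 2, \ldots,
\tag{7.2}
\]

corresponding to those trigonometric functions that have period $2\ell$. For reasons that will
soon become apparent, the Fourier coefficients of $f$ are now denoted as

\[
c_\nu = \frac{1}{2\ell} \int_{-\ell}^{\ell} f(x)\, e^{-i k_\nu x}\, dx = \sqrt{\frac{\pi}{2}}\; \frac{\hat{f}_\ell(k_\nu)}{\ell},
\tag{7.3}
\]

so that

\[
\hat{f}_\ell(k_\nu) = \frac{1}{\sqrt{2\pi}} \int_{-\ell}^{\ell} f(x)\, e^{-i k_\nu x}\, dx.
\tag{7.4}
\]

This reformulation of the basic Fourier series formula allows us to easily pass to the limit
as the interval's length $\ell \to \infty$.

On an interval of length $2\ell$, the frequencies (7.2) required to represent a function in
Fourier series form are equally distributed, with interfrequency spacing

\[
\Delta k = k_{\nu+1} - k_\nu = \frac{\pi}{\ell}.
\tag{7.5}
\]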

As $\ell \to \infty$, the spacing $\Delta k \to 0$, and so the relevant frequencies become more and more
densely packed in the line $-\infty < k < \infty$. In the limit, we thus anticipate that all possible
frequencies will be represented. Indeed, letting $k_\nu = k$ be arbitrary in (7.4), and sending
$\ell \to \infty$, results in the infinite integral

\[
\hat{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-i k x}\, dx,
\tag{7.6}
\]

known as the Fourier transform of the function $f(x)$. If $f(x)$ is a sufficiently nice function,
e.g., piecewise continuous and decaying to 0 reasonably quickly as $|x| \to \infty$, its Fourier
transform $\hat{f}(k)$ is defined for all possible frequencies $k \in \mathbb{R}$. The preceding formula will
sometimes conveniently be abbreviated as

\[
\hat{f}(k) = \mathcal{F}[f(x)],
\tag{7.7}
\]

where $\mathcal{F}$ is the Fourier transform operator, which maps each (sufficiently nice) function of
the spatial variable $x$ to a function of the frequency variable $k$.

To  reconstruct  the  function  from  its  Fourier  transform,  we  apply  a  similar  limiting 
procedure  to  the  Fourier  series  (7.1),  which  we  first  rewrite  in  a  more  suggestive  form, 


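\[
f(x) \sim \frac{1}{\sqrt{2\pi}} \sum_{\nu = -\infty}^{\infty} \hat{f}_\ell(k_\nu)\, e^{\,i k_\nu x}\, \Delta k,
\tag{7.8}
\]

using (7.5). For each fixed value of $x$, the right-hand side has the form of a Riemann sum
approximating the integral

\[
\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat{f}_\ell(k)\, e^{\,i k x}\, dk.
\]

As $\ell \to \infty$, the functions (7.4) converge to the Fourier transform: $\hat{f}_\ell(k) \to \hat{f}(k)$; moreover,
the interfrequency spacing $\Delta k = \pi/\ell \to 0$, and so one expects the Riemann sums to
converge to the limiting integral

\[
f(x) \sim \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat{f}(k)\, e^{\,i k x}\, dk.
\tag{7.9}
\]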

The resulting formula serves to define the inverse Fourier transform, which is used to
recover the original signal from its Fourier transform. In this manner, the Fourier series has
become a Fourier integral that reconstructs the function $f(x)$ as a (continuous)
superposition of complex exponentials $e^{ikx}$ of all possible frequencies, with $\hat{f}(k)/\sqrt{2\pi}$ quantifying
the amount contributed by the complex exponential of frequency $k$. In abbreviated form,
formula (7.9) can be written

\[
f(x) = \mathcal{F}^{-1}[\hat{f}(k)],
\tag{7.10}
\]


thus  defining  the  inverse  of  the  Fourier  transform  operator  (7.7). 
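The limiting argument can be tried out on a computer. The sketch below evaluates the truncated transform (7.4) by the midpoint rule and then forms a truncated version of the Riemann sum (7.8) on a fixed interval. The sample function $e^{-x^2}$, the half-length `ell`, the cutoff `nu_max`, and all function names are illustrative assumptions rather than anything prescribed by the text.

```python
import cmath
import math

def f(x):
    """Sample function (an assumption for this sketch): a rapidly decaying Gaussian."""
    return math.exp(-x * x)

def f_hat_ell(k, ell, n=2000):
    """Midpoint-rule value of (7.4): (1/sqrt(2 pi)) * integral of f(x) e^{-ikx} over [-ell, ell]."""
    h = 2.0 * ell / n
    total = 0j
    for j in range(n):
        x = -ell + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / math.sqrt(2.0 * math.pi)

def riemann_sum_reconstruction(x, ell=6.0, nu_max=30):
    """Riemann sum (7.8), truncated to the frequencies k_nu = pi nu / ell with |nu| <= nu_max."""
    dk = math.pi / ell                      # interfrequency spacing (7.5)
    total = 0j
    for nu in range(-nu_max, nu_max + 1):
        k = nu * dk
        total += f_hat_ell(k, ell) * cmath.exp(1j * k * x) * dk
    return (total / math.sqrt(2.0 * math.pi)).real

if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0):
        print(x, riemann_sum_reconstruction(x), f(x))
```

For a function that is negligible outside $[-\ell, \ell]$, the sum already reproduces $f(x)$ closely at moderate $\ell$; for slowly decaying functions one would need larger `ell` and `nu_max`, which is the numerical counterpart of sending $\ell \to \infty$.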

It  is  worth  pointing  out  that  both  the  Fourier  transform  (7.7)  and  its  inverse  (7.10) 
define  linear  operators  on  function  space.  This  means  that  the  Fourier  transform  of  the 
sum  of  two  functions  is  the  sum  of  their  individual  transforms,  while  multiplying  a  function 
by  a  constant  multiplies  its  Fourier  transform  by  the  same  factor: 


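\[
\begin{aligned}
\mathcal{F}[f(x) + g(x)] &= \mathcal{F}[f(x)] + \mathcal{F}[g(x)] = \hat{f}(k) + \hat{g}(k), \\
\mathcal{F}[c\,f(x)] &= c\,\mathcal{F}[f(x)] = c\,\hat{f}(k).
\end{aligned}
\tag{7.11}
\]

A similar statement holds for the inverse Fourier transform $\mathcal{F}^{-1}$.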

Recapitulating, by letting the length of the interval go to $\infty$, the discrete Fourier series
has become a continuous Fourier integral, while the Fourier coefficients, which were defined
only at a discrete collection of possible frequencies, have become a complete function $\hat{f}(k)$
defined on all of frequency space. The reconstruction of $f(x)$ from its Fourier transform
$\hat{f}(k)$ via (7.9) can be rigorously justified under suitable hypotheses. For example, if $f(x)$
is piecewise $C^1$ on all of $\mathbb{R}$ and decays reasonably rapidly, $f(x) \to 0$ as $|x| \to \infty$, so
that its Fourier integral (7.6) converges absolutely, then it can be proved, [37, 117], that
the inverse Fourier integral (7.9) will converge to $f(x)$ at all points of continuity, and to
the midpoint $\tfrac12\bigl(f(x^-) + f(x^+)\bigr)$ at jump discontinuities, just like a Fourier series. In
particular, its Fourier transform must also decay, $\hat{f}(k) \to 0$ as $|k| \to \infty$, implying that (as
with Fourier series) the very high frequency modes make negligible contributions to the
reconstruction of such a signal. A more precise result will be formulated in Theorem 7.15
below.

Figure 7.1. Fourier transform of a rectangular pulse.


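Example 7.1. The Fourier transform of the rectangular pulse†

\[
f(x) = \sigma(x + a) - \sigma(x - a) = \begin{cases} 1, & -a < x < a, \\ 0, & |x| > a, \end{cases}
\tag{7.12}
\]

of width $2a$, is easily computed:

\[
\hat{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-a}^{a} e^{-i k x}\, dx
= \frac{e^{\,i k a} - e^{-i k a}}{\sqrt{2\pi}\; i k}
= \sqrt{\frac{2}{\pi}}\; \frac{\sin a k}{k}.
\tag{7.13}
\]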


On the other hand, the reconstruction of the pulse via the inverse transform (7.9) tells us
that

\[
\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{e^{\,i k x} \sin a k}{k}\, dk = f(x) =
\begin{cases} 1, & -a < x < a, \\ \tfrac12, & x = \pm a, \\ 0, & |x| > a. \end{cases}
\tag{7.14}
\]


Note the convergence to the middle of the jump discontinuities at $x = \pm a$. The real part
of this complex integral produces a striking trigonometric integral identity:

\[
\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\cos x k \, \sin a k}{k}\, dk =
\begin{cases} 1, & -a < x < a, \\ \tfrac12, & x = \pm a, \\ 0, & |x| > a. \end{cases}
\tag{7.15}
\]


Just as many Fourier series yield nontrivial summation formulas, the reconstruction of a
function from its Fourier transform often leads to nontrivial integration formulas. One
cannot compute the integral (7.14) by the Fundamental Theorem of Calculus, since there
is no elementary function whose derivative equals the integrand.‡ In Figure 7.1 we display
the box function with $a = 1$, its Fourier transform, along with a reconstruction obtained
by numerically integrating (7.15). Since we are dealing with an infinite integral, we must
break off the numerical integrator by restricting it to a finite interval. The first graph
in the second row is obtained by integrating from $-5 \le k \le 5$, while the second is from
$-10 \le k \le 10$. The nonuniform convergence of the integral leads to the appearance
of a Gibbs phenomenon at the two discontinuities, similar to what we observed in the
nonuniform convergence of a Fourier series.

† $\sigma(x)$ is the unit step function (3.46).
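The truncated reconstruction described above is straightforward to reproduce. The following sketch integrates (7.15) over a finite frequency window by the midpoint rule; the function name, the default cutoff, and the number of quadrature points are illustrative choices, not the procedure actually used to produce Figure 7.1.

```python
import math

def truncated_reconstruction(x, a=1.0, cutoff=10.0, n=4000):
    """Midpoint-rule value of (1/pi) * integral over [-cutoff, cutoff] of cos(xk) sin(ak) / k dk.

    With n even, no midpoint lands on k = 0, so the removable
    singularity of sin(ak)/k is never evaluated.
    """
    h = 2.0 * cutoff / n
    total = 0.0
    for j in range(n):
        k = -cutoff + (j + 0.5) * h
        total += math.cos(x * k) * math.sin(a * k) / k
    return total * h / math.pi

if __name__ == "__main__":
    for cutoff in (5.0, 10.0):
        samples = [truncated_reconstruction(0.01 * i, cutoff=cutoff) for i in range(200)]
        print("cutoff", cutoff,
              "| value at x=0:", samples[0],
              "| value at the jump x=1:", samples[100],
              "| largest value on [0, 2):", max(samples))
```

Increasing the cutoff narrows the oscillations near $x = \pm a$, but the peak just inside the jump should remain noticeably above 1, which is the Gibbs overshoot mentioned in the text.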

On the other hand, the identity resulting from the imaginary part,

\[
\frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\sin k x \, \sin a k}{k}\, dk = 0,
\]


is,  on  the  surface,  not  surprising,  because  the  integrand  is  odd.  However,  it  is  far  from 
obvious  that  either  integral  converges;  indeed,  the  amplitude  of  the  oscillatory  integrand 
decays  like  1/|  k  |,  but  the  latter  function  does  not  have  a  convergent  integral,  and  so  the 
usual  comparison  test  for  infinite  integrals,  [8,97],  fails  to  apply.  Their  convergence  is 
marginal  at  best,  and  the  trigonometric  oscillations  somehow  manage  to  ameliorate  the 
slow  rate  of  decay  of  1/k. 

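Example 7.2. Consider an exponentially decaying right-handed pulse§

\[
f_r(x) = \begin{cases} e^{-a x}, & x > 0, \\ 0, & x < 0, \end{cases}
\tag{7.16}
\]

where $a > 0$. We compute its Fourier transform directly from the definition:

\[
\hat{f}_r(k) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} e^{-a x}\, e^{-i k x}\, dx
= -\,\frac{1}{\sqrt{2\pi}}\; \frac{e^{-(a + i k) x}}{a + i k}\, \bigg|_{x = 0}^{\infty}
= \frac{1}{\sqrt{2\pi}\,(a + i k)}.
\]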


As in the preceding example, the inverse Fourier transform produces a nontrivial integral
identity:

\[
\frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{e^{\,i k x}}{a + i k}\, dk =
\begin{cases} e^{-a x}, & x > 0, \\ \tfrac12, & x = 0, \\ 0, & x < 0. \end{cases}
\tag{7.17}
\]


Similarly, a pulse that decays to the left,

\[
f_l(x) = \begin{cases} e^{a x}, & x < 0, \\ 0, & x > 0, \end{cases}
\tag{7.18}
\]

where $a > 0$ is still positive, has Fourier transform

\[
\hat{f}_l(k) = \frac{1}{\sqrt{2\pi}\,(a - i k)}.
\tag{7.19}
\]


‡ One can use Euler's formula (3.59) to reduce (7.14) to a complex version of the exponential
integral $\int (e^{\alpha k}/k)\, dk$, but it can be proved, [25], that neither integral can be written in terms of
elementary functions.

§ Note that we cannot Fourier transform the entire exponential function $e^{-a x}$, because it does
not go to zero at both $\pm\infty$, which is required for the integral (7.6) to converge.

Figure 7.2. Exponential pulses (including the odd pulse $f_o(x)$).


This also follows from the general fact that the Fourier transform of $f(-x)$ is $\hat{f}(-k)$; see
Exercise 7.1.10. The even exponentially decaying pulse

\[
f_e(x) = e^{-a |x|}
\tag{7.20}
\]

is merely the sum of left and right pulses: $f_e = f_r + f_l$. Thus, by linearity,

\[
\hat{f}_e(k) = \hat{f}_r(k) + \hat{f}_l(k)
= \frac{1}{\sqrt{2\pi}\,(a + i k)} + \frac{1}{\sqrt{2\pi}\,(a - i k)}
= \sqrt{\frac{2}{\pi}}\; \frac{a}{k^2 + a^2}.
\tag{7.21}
\]

The resulting Fourier transform is real and even because $f_e(x)$ is a real-valued even
function; see Exercise 7.1.12. The inverse Fourier transform (7.9) produces another nontrivial
integral identity:

\[
e^{-a |x|} = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{a\, e^{\,i k x}}{k^2 + a^2}\, dk
= \frac{a}{\pi} \int_{-\infty}^{\infty} \frac{\cos k x}{k^2 + a^2}\, dk.
\tag{7.22}
\]


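(The imaginary part of the integral vanishes, because its integrand is odd.) On the other
hand, the odd exponentially decaying pulse,

\[
f_o(x) = (\operatorname{sign} x)\, e^{-a |x|} = \begin{cases} e^{-a x}, & x > 0, \\ -\,e^{a x}, & x < 0, \end{cases}
\tag{7.23}
\]

is the difference of the right and left pulses, $f_o = f_r - f_l$, and has purely imaginary and
odd Fourier transform

\[
\hat{f}_o(k) = \hat{f}_r(k) - \hat{f}_l(k)
= \frac{1}{\sqrt{2\pi}\,(a + i k)} - \frac{1}{\sqrt{2\pi}\,(a - i k)}
= -\,i\, \sqrt{\frac{2}{\pi}}\; \frac{k}{k^2 + a^2}.
\tag{7.24}
\]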
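The closed-form transforms of the four exponential pulses can be compared against direct numerical quadrature of the defining integral (7.6). This is only a sanity check under stated assumptions: the decay rate `A = 1.5`, the truncation of the integral to $[-40, 40]$, the midpoint rule, and every function name below are choices made for this sketch.

```python
import cmath
import math

A = 1.5  # decay rate a > 0 (hypothetical sample value)

def fourier_transform(f, k, half_width=40.0, n=40000):
    """Midpoint-rule approximation of (7.6) on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / math.sqrt(2.0 * math.pi)

def f_r(x):            # right-handed pulse (7.16)
    return math.exp(-A * x) if x > 0 else 0.0

def f_l(x):            # left-handed pulse (7.18)
    return math.exp(A * x) if x < 0 else 0.0

def f_e(x):            # even pulse (7.20)
    return f_r(x) + f_l(x)

def f_o(x):            # odd pulse (7.23)
    return f_r(x) - f_l(x)

def exact_transforms(k):
    """Closed forms: transform of f_r, then (7.19), (7.21), (7.24)."""
    root = math.sqrt(2.0 * math.pi)
    return {
        "f_r": 1.0 / (root * (A + 1j * k)),
        "f_l": 1.0 / (root * (A - 1j * k)),
        "f_e": math.sqrt(2.0 / math.pi) * A / (k * k + A * A),
        "f_o": -1j * math.sqrt(2.0 / math.pi) * k / (k * k + A * A),
    }

if __name__ == "__main__":
    k = 2.0
    exact = exact_transforms(k)
    for name, func in (("f_r", f_r), ("f_l", f_l), ("f_e", f_e), ("f_o", f_o)):
        print(name, fourier_transform(func, k), exact[name])
```

The numerical values for the even pulse should come out essentially real and those for the odd pulse essentially imaginary, matching the parity remarks in the text.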
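
The inverse transform is

\[
(\operatorname{sign} x)\, e^{-a |x|}
= -\,\frac{i}{\pi} \int_{-\infty}^{\infty} \frac{k\, e^{\,i k x}}{k^2 + a^2}\, dk
= \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{k \sin k x}{k^2 + a^2}\, dk.
\tag{7.25}
\]

As a final example, consider the rational function

\[
f(x) = \frac{1}{x^2 + a^2}, \qquad \text{where} \quad a > 0.
\tag{7.26}
\]

Its Fourier transform requires integrating

\[
\hat{f}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{e^{-i k x}}{x^2 + a^2}\, dx.
\tag{7.27}
\]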


The indefinite integral (anti-derivative) does not appear in basic integration tables, and, in
fact, cannot be done in terms of elementary functions. However, we have just managed to
evaluate this particular integral! Look at (7.22). If we change $x$ to $k$ and $k$ to $-x$, then we
exactly recover the integral (7.27) up to a factor of $a\sqrt{2/\pi}$. We conclude that the Fourier
transform of (7.26) is

\[
\hat{f}(k) = \sqrt{\frac{\pi}{2}}\; \frac{e^{-a |k|}}{a}.
\tag{7.28}
\]


This last example is indicative of an important general fact. The reader has no doubt
already noted the remarkable similarity between the Fourier transform (7.6) and its inverse
(7.9). Indeed, the only difference is that the former has a minus sign in the exponential.
This implies the following Symmetry Principle relating the direct and inverse Fourier
transforms.


Theorem 7.3. If the Fourier transform of the function $f(x)$ is $\hat{f}(k)$, then the Fourier
transform of $\hat{f}(x)$ is $f(-k)$.


The Symmetry Principle allows us to reduce the tabulation of Fourier transforms by
half. For instance, referring back to Example 7.1, we deduce that the Fourier transform of
the function

\[
f(x) = \sqrt{\frac{2}{\pi}}\; \frac{\sin a x}{x}
\]

is

\[
\hat{f}(k) = \sigma(-k + a) - \sigma(-k - a) = \sigma(k + a) - \sigma(k - a) =
\begin{cases} 1, & -a < k < a, \\ \tfrac12, & k = \pm a, \\ 0, & |k| > a. \end{cases}
\tag{7.29}
\]

Note that, by linearity, we can divide both $f(x)$ and $\hat{f}(k)$ by $\sqrt{2/\pi}$ to deduce the Fourier
transform of $\dfrac{\sin a x}{x}$.
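The Symmetry Principle can be illustrated numerically with the pair from Example 7.2: the even pulse $e^{-a|x|}$ has transform $\sqrt{2/\pi}\, a/(k^2 + a^2)$, so transforming the latter, viewed as a function of $x$, should return $e^{-a|k|}$ (the pulse is even, so $f(-k) = f(k)$). The sketch below assumes $a = 1$, a truncated integration window, and the midpoint rule; the function names are hypothetical.

```python
import cmath
import math

def fourier_transform(f, k, half_width=400.0, n=80000):
    """Midpoint-rule approximation of (7.6) on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / math.sqrt(2.0 * math.pi)

def even_pulse(x, a=1.0):
    """f_e(x) = exp(-a |x|), as in (7.20)."""
    return math.exp(-a * abs(x))

def even_pulse_transform(x, a=1.0):
    """The transform (7.21), with the frequency variable renamed x."""
    return math.sqrt(2.0 / math.pi) * a / (x * x + a * a)

if __name__ == "__main__":
    for k in (1.0, -2.0):
        # Symmetry Principle: the transform of f-hat, evaluated at k, equals f(-k).
        print(k, fourier_transform(even_pulse_transform, k), even_pulse(-k))
```

Because $1/(x^2 + a^2)$ decays only quadratically, the window has to be much wider than for the exponential pulses, and the agreement is correspondingly less sharp; the same check at $k = 0$ would converge even more slowly, since the integrand no longer oscillates.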


Warning: Some authors omit the $\sqrt{2\pi}$ factor in the definition (7.6) of the Fourier
transform $\hat{f}(k)$. This alternative convention does have a slight advantage of eliminating
many $\sqrt{2\pi}$ factors in the transformed expressions. However, this necessitates an extra
such factor in the reconstruction formula (7.9), which is achieved by replacing $\sqrt{2\pi}$ by
$2\pi$. A significant disadvantage is that the resulting formulas for the Fourier transform and
its inverse are less similar, and so the Symmetry Principle of Theorem 7.3 requires some
modification. (On the other hand, convolution, to be discussed below, is a little easier
without the extra factor.) Yet another, more recent, convention can be found in Exercise
7.1.18. When consulting any particular reference, the reader always needs to check which
version of the Fourier transform is being used.




All of the functions in Example 7.2 required $a > 0$ for the Fourier integrals to converge.
The functions that emerge in the limit as $a$ goes to 0 are of special interest. Let us start
with the odd exponential pulse (7.23). When $a \to 0$, the function $f_o(x)$ converges to the
sign function

\[
f(x) = \operatorname{sign} x = \sigma(x) - \sigma(-x) = \begin{cases} +1, & x > 0, \\ -1, & x < 0. \end{cases}
\tag{7.30}
\]

Taking the limit of the Fourier transform (7.24) leads to

\[
\hat{f}(k) = -\,i\, \sqrt{\frac{2}{\pi}}\; \frac{1}{k}.
\tag{7.31}
\]


The nonintegrable singularity of $\hat{f}(k)$ at $k = 0$ is indicative of the fact that the sign function
does not decay as $|x| \to \infty$. In this case, neither the Fourier transform integral nor its
inverse is well defined as a standard (Riemann, or even Lebesgue) integral. Nevertheless, it
is possible to rigorously justify these results within the framework of generalized functions.

More interesting are the even pulse functions $f_e(x)$, which, in the limit $a \to 0$, become
the constant function

\[
f(x) \equiv 1.
\tag{7.32}
\]

The limit of the Fourier transform (7.21) is

\[
\lim_{a \to 0} \sqrt{\frac{2}{\pi}}\; \frac{a}{k^2 + a^2} = \begin{cases} 0, & k \ne 0, \\ \infty, & k = 0. \end{cases}
\tag{7.33}
\]


This limiting behavior should remind the reader of our construction (6.10) of the delta
function as the limit of the functions

\[
\delta(x) = \lim_{n \to \infty} \frac{n}{\pi\,(1 + n^2 x^2)} = \lim_{a \to 0^+} \frac{a}{\pi\,(a^2 + x^2)}.
\]

Comparing with (7.33), we conclude that the Fourier transform of the constant function
(7.32) is a multiple of the delta function in the frequency variable:

\[
\hat{f}(k) = \sqrt{2\pi}\; \delta(k).
\tag{7.34}
\]

The direct transform integral

\[
\delta(k) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i k x}\, dx
\tag{7.35}
\]

is, strictly speaking, not defined, because the infinite integrals of the oscillatory sine and
cosine functions don't converge! However, this identity can be validly interpreted within
the framework of weak convergence and generalized functions. On the other hand, the
inverse transform formula (7.9) yields

\[
\int_{-\infty}^{\infty} \delta(k)\, e^{\,i k x}\, dk = e^{\,i\, 0\, x} = 1,
\]

which is in accord with the basic definition (6.16) of the delta function. As in the preceding
case, the delta function singularity at $k = 0$ manifests the lack of decay of the constant
function.
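
Conversely, the delta function $\delta(x)$ has constant Fourier transform

\[
\hat{\delta}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \delta(x)\, e^{-i k x}\, dx = \frac{e^{-i k\, 0}}{\sqrt{2\pi}} \equiv \frac{1}{\sqrt{2\pi}},
\tag{7.36}
\]

a result that also follows from the Symmetry Principle of Theorem 7.3. To determine the
Fourier transform of a delta spike $\delta_\xi(x) = \delta(x - \xi)$ concentrated at position $x = \xi$, we
compute

\[
\hat{\delta}_\xi(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \delta(x - \xi)\, e^{-i k x}\, dx = \frac{e^{-i k \xi}}{\sqrt{2\pi}}.
\tag{7.37}
\]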


The result is a pure exponential in frequency space. Applying the inverse Fourier transform
(7.9) leads, at least on a formal level, to the remarkable identity

\[
\delta_\xi(x) = \delta(x - \xi) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{\,i k (x - \xi)}\, dk
= \frac{1}{2\pi}\, \bigl\langle\, e^{\,i k x},\, e^{\,i k \xi} \,\bigr\rangle,
\tag{7.38}
\]

where $\langle \cdot\,, \cdot \rangle$ denotes the $L^2$ Hermitian inner product of complex-valued functions of
$k \in \mathbb{R}$. Since the delta function vanishes for $x \ne \xi$, this identity is telling us that complex
exponentials of differing frequencies are mutually orthogonal. However, as with (7.35),
this makes sense only within the language of generalized functions. On the other hand,
multiplying both sides of (7.38) by $f(\xi)$ and then integrating with respect to $\xi$ produces

\[
f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(\xi)\, e^{\,i k (x - \xi)}\, d\xi\, dk.
\tag{7.39}
\]

This is a perfectly valid formula, being a restatement (or, rather, combination) of the
basic formulas (7.6) and (7.9) connecting the direct and inverse Fourier transforms of the
function $f(x)$.

Conversely, the Symmetry Principle tells us that the Fourier transform of a pure
exponential $e^{\,i \kappa x}$ will be a shifted delta spike $\sqrt{2\pi}\, \delta(k - \kappa)$, concentrated at frequency
$k = \kappa$. Both results are particular cases of the following Shift Theorem, whose proof is left
as an exercise for the reader.


Theorem 7.4. If $f(x)$ has Fourier transform $\hat{f}(k)$, then the Fourier transform of the
shifted function $f(x - \xi)$ is $e^{-i k \xi}\, \hat{f}(k)$. Similarly, the transform of the product function
$e^{\,i \kappa x} f(x)$, for real $\kappa$, is the shifted transform $\hat{f}(k - \kappa)$.


In  a  similar  vein,  the  Dilation  Theorem  gives  the  effect  of  a  scaling  transformation  on 
the  Fourier  transform.  Again,  the  proof  is  left  to  the  reader. 


Theorem 7.5. If $f(x)$ has Fourier transform $\hat{f}(k)$, then the Fourier transform of the
rescaled function $f(c x)$ for $0 \ne c \in \mathbb{R}$ is $\dfrac{1}{|c|}\, \hat{f}\!\left( \dfrac{k}{c} \right)$.
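Although the proofs are left to the reader, the Shift and Dilation Theorems are easy to test numerically. The sketch below uses $f(x) = e^{-|x|}$, whose transform is (7.21) with $a = 1$, and compares quadrature of (7.6) for shifted, modulated, and rescaled versions with the predicted formulas. The sample values of $\xi$, $\kappa$, $c$, and $k$, along with all function names, are hypothetical.

```python
import cmath
import math

def fourier_transform(f, k, half_width=40.0, n=40000):
    """Midpoint-rule approximation of (7.6) on [-half_width, half_width]."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + (j + 0.5) * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / math.sqrt(2.0 * math.pi)

def f(x):
    """Even exponential pulse with a = 1."""
    return math.exp(-abs(x))

def f_hat(k):
    """Its transform, from (7.21) with a = 1."""
    return math.sqrt(2.0 / math.pi) / (k * k + 1.0)

def shift_check(k, xi):
    """Return (numerical, predicted) transforms of f(x - xi)."""
    return fourier_transform(lambda x: f(x - xi), k), cmath.exp(-1j * k * xi) * f_hat(k)

def modulation_check(k, kappa):
    """Return (numerical, predicted) transforms of e^{i kappa x} f(x)."""
    return fourier_transform(lambda x: cmath.exp(1j * kappa * x) * f(x), k), f_hat(k - kappa)

def dilation_check(k, c):
    """Return (numerical, predicted) transforms of f(c x), c nonzero."""
    return fourier_transform(lambda x: f(c * x), k), f_hat(k / c) / abs(c)

if __name__ == "__main__":
    print("shift     ", shift_check(1.3, 0.7))
    print("modulation", modulation_check(1.3, 0.5))
    print("dilation  ", dilation_check(1.3, -2.0))
```

A negative scale factor such as $c = -2$ exercises the absolute value in the Dilation Theorem; for this even function the reflection itself has no visible effect, so an asymmetric sample function would make a more demanding test.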

Concise Table of Fourier Transforms


Note: The parameters $a$, $c$, $d$ are real, with $a > 0$ and $c \ne 0$.

Example 7.6. Let us determine the Fourier transform of the Gaussian function
$g(x) = e^{-x^2}$. To evaluate its Fourier integral, we first complete the square in the exponent:

\[
\hat{g}(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2 - i k x}\, dx
= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-(x + i k/2)^2 - k^2/4}\, dx
= \frac{e^{-k^2/4}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-y^2}\, dy
= \frac{e^{-k^2/4}}{\sqrt{2}}.
\]

The next-to-last equality employed the change of variables¶ $y = x + \tfrac12\, i k$, while the final
step used formula (2.100).

More generally, to find the Fourier transform of $g_a(x) = e^{-a x^2}$, where $a > 0$, we invoke
the Dilation Theorem 7.5 with $c = \sqrt{a}$ to deduce that $\hat{g}_a(k) = e^{-k^2/(4a)} / \sqrt{2a}$.
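The Gaussian formula can be checked against direct quadrature of (7.6), which sidesteps the complex change of variables used in the derivation. This is a small sketch under stated assumptions: the integration window $[-12, 12]$, the midpoint rule, the sample values of $a$ and $k$, and the function names are illustrative.

```python
import cmath
import math

def gaussian_transform_numeric(k, a=1.0, half_width=12.0, n=4000):
    """Midpoint-rule approximation of (7.6) for g_a(x) = exp(-a x^2)."""
    h = 2.0 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + (j + 0.5) * h
        total += math.exp(-a * x * x) * cmath.exp(-1j * k * x)
    return total * h / math.sqrt(2.0 * math.pi)

def gaussian_transform_exact(k, a=1.0):
    """Closed form from Example 7.6: exp(-k^2 / (4a)) / sqrt(2a)."""
    return math.exp(-k * k / (4.0 * a)) / math.sqrt(2.0 * a)

if __name__ == "__main__":
    for a, k in ((1.0, 0.0), (1.0, 1.5), (0.5, 2.0), (4.0, 3.0)):
        print(a, k, gaussian_transform_numeric(k, a), gaussian_transform_exact(k, a))
```

A narrow Gaussian in $x$ (large $a$) produces a wide Gaussian in $k$ and vice versa, which the printed values for different $a$ make visible.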


Since  the  Fourier  transform  uniquely  associates  a  function  f(k)  on  frequency  space 
with  each  (reasonable)  function  f(x)  on  physical  space,  one  can  characterize  functions  by 
their  transforms.  Many  practical  applications  rely  on  tables  (or,  even  better,  computer 
algebra  systems  such  as  Mathematica  and  Maple)  that  recognize  a  wide  variety  of 
transforms  of  basic  functions  of  importance  in  applications.  The  accompanying  table  lists 
some  of  the  most  important  examples  of  functions  and  their  Fourier  transforms,  based 
on  our  convention  (7.6).  Keep  in  mind  that,  by  applying  the  Symmetry  Principle  of 
Theorem  7.3,  each  entry  can  be  used  to  deduce  two  different  Fourier  transforms.  A  more 
extensive collection of Fourier transforms can be found in [82].


Exercises 


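7.1.1. Find the Fourier transform of the following functions:

(a) $e^{-(x+4)^2}$, (b) $e^{-|x+1|}$,

(c) $\begin{cases} x, & |x| < 1, \\ 0, & \text{otherwise}, \end{cases}$
(d) $\begin{cases} e^{-2x}, & x > 0, \\ e^{3x}, & x < 0, \end{cases}$
(e) $\begin{cases} e^{-|x|}, & |x| > 1, \\ e^{-1}, & |x| < 1, \end{cases}$

(f) $\begin{cases} e^{-x} \sin x, & x > 0, \\ 0, & x < 0, \end{cases}$
(g) $\begin{cases} 1 - |x|, & |x| < 1, \\ 0, & \text{otherwise}. \end{cases}$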


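7.1.2. Find the inverse Fourier transform of the following functions: (a) $e^{-k^2}$, (b) $e^{-|k|}$,

(c) $\begin{cases} e^{-k} \sin k, & k > 0, \\ 0, & k < 0, \end{cases}$
(d) $\begin{cases} 1, & \alpha < k < \beta, \\ 0, & \text{otherwise}, \end{cases}$
(e) $\begin{cases} 1 - |k|, & |k| < 1, \\ 0, & \text{otherwise}. \end{cases}$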

7.1.3.  Find  the  inverse  Fourier  transform  of  the  function  1  / (k  +  c)  when  (a)  c  =  a  is  real; 
(b)  c  =  ib  is  purely  imaginary;  (c)  c  =  a  +  ib  is  an  arbitrary  complex  number. 

7.1.4.  Find  the  inverse  Fourier  transform  of  1  / (k2  —  a2),  where  a  >  0  is  real. 

Hint:  Use  Exercise  7.1.3. 

0 7.1.5. (a) Find the Fourier transform of $e^{\,i \omega x}$. (b) Use this to find the Fourier transforms of
the basic trigonometric functions $\cos \omega x$ and $\sin \omega x$.

7.1.6. Write down two real integral identities that result from the inverse Fourier transform
of  (7.28). 


^  Since  this  represents  a  complex  change  of  variables,  a  fully  rigorous  justification  of  this  step 
requires  the  use  of  complex  integration. 
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7.1.7.  Write  down  two  real  integral  identities  that  follow  from  (7.17). 

7.1.8.  (a)  Find  the  Fourier  transform  of  the  hat  function  fn(x)  = 

(b)  What  is  the  limit,  as  tl  — y  oo,  of  ?n(fc)? 

(c)  In  what  sense  is  the  limit  the  Fourier  transform  of  the  limit  of  /  (x)? 

7.1.9.  (a)  Justify  the  linearity  of  the  Fourier  transform,  as  in  (7.11). 

(b)  State  and  justify  the  linearity  of  the  inverse  Fourier  transform. 

0 7.1.10. If the Fourier transform of $f(x)$ is $\hat f(k)$, prove that (a) the Fourier transform of $f(-x)$
is $\hat f(-k)$; (b) the Fourier transform of the complex conjugate function $\overline{f(x)}$ is $\overline{\hat f(-k)}$.

7.1.11. True or false: If the complex-valued function $f(x) = g(x) + i\,h(x)$ has Fourier transform
$\hat f(k) = \hat g(k) + i\,\hat h(k)$, then $g(x)$ has Fourier transform $\hat g(k)$ and $h(x)$ has Fourier transform $\hat h(k)$.

7.1.12.  (a)  Prove  that  the  Fourier  transform  of  an  even  function  is  even,  (b)  Prove  that  the 
Fourier  transform  of  a  real  even  function  is  real  and  even,  (c)  What  can  you  say  about  the 
Fourier  transform  of  an  odd  function?  (d)  Of  a  real  odd  function?  (e)  What  about  a 
general  real  function? 


x  |  <  1/n, 
otherwise. 


^  7.1.13.  Prove  the  Shift  Theorem  7.4. 

0  7.1.14.  Prove  the  Dilation  Theorem  7.5. 

7.1.15. Given that the Fourier transform of $f(x)$ is $\hat f(k)$, find, from first principles, the Fourier
transform of $g(x) = f(ax + b)$, where a and b are fixed real constants.

7.1.16. Let a be a real constant. Given the Fourier transform $\hat f(k)$ of $f(x)$, find the Fourier
transforms of (a) $f(x)\,e^{iax}$, (b) $f(x) \cos ax$, (c) $f(x) \sin ax$.


0 7.1.17. A common alternative convention for the Fourier transform is to define

$$\hat f_1(k) = \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\, dx.$$

(a) What is the formula for the corresponding inverse Fourier transform?

(b) How is $\hat f_1(k)$ related to our Fourier transform $\hat f(k)$?

7.1.18. Another convention for the Fourier transform is to define

$$\hat f_2(k) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i k x}\, dx.$$

Answer the questions in Exercise 7.1.17 for this version of the Fourier transform.


T  7.1.19.  The  cosine  and  sine  transforms  of  a  real  function  f(x)  are  defined  as 

r OO  r OO 

c(k)  =  /  /(x)  cos  k  x  dx:  s(k)  =  /  f(x)  sin  kx  dx. 


■oo 


■oo 


(7.40) 


(i) Prove that $\hat f(k) = c(k) - i\,s(k)$. (ii) Find the cosine and sine transforms of the functions
in Exercise 7.1.1. (iii) Show that $c(k)$ is an even function, while $s(k)$ is an odd function.
(iv) Show that if $f$ is an even function, then $s(k) = 0$, while if $f$ is an odd function,
then $c(k) = 0$.


^ 7.1.20. The two-dimensional Fourier transform of a function $f(x,y)$ defined for $(x,y) \in \mathbb{R}^2$ is

$$\hat f(k,l) = \frac{1}{2\pi} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x,y)\, e^{-i(kx + ly)}\, dx\, dy. \tag{7.41}$$

(a) Compute the Fourier transform of the following functions:


(*)  e 

(iv) 


X 


1. 


y 


x 


(ii)  e 

V  I  <  1, 


0,  otherwise, 


2  2 
x  -y 


0) 


(Hi)  the  delta  function  S(x  —  £)  S(y  —  77), 

;  1 

(vi)  cos (x  —  y). 


i lj 

1  X  |  + 

y 

1  0, 

otherwise 

(b) Show that if $f(x,y) = g(x)\,h(y)$, then $\hat f(k,l) = \hat g(k)\,\hat h(l)$.

(c) What is the formula for the inverse two-dimensional Fourier transform, i.e., how can you
reconstruct $f(x,y)$ from $\hat f(k,l)$?

One of the most significant features of the Fourier transform is that it converts calculus
into  algebra!  More  specifically,  the  two  basic  operations  in  calculus  —  differentiation  and 
integration  of  functions  —  are  realized  as  algebraic  operations  on  their  Fourier  transforms. 
(The  downside  is  that  algebraic  operations  become  more  complicated  in  the  frequency 
domain.) 

Differentiation 


Let  us  begin  with  derivatives.  If  we  differentiate^  the  basic  inverse  Fourier  transform 
formula 


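$$f(x) \sim \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat f(k)\, e^{ikx}\, dk$$

with respect to x, we obtain

$$f'(x) \sim \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} ik\, \hat f(k)\, e^{ikx}\, dk. \tag{7.42}$$

The resulting integral is itself in the form of an inverse Fourier transform, namely of $ik\,\hat f(k)$,
which immediately implies the following key result.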


Proposition 7.7. The Fourier transform of the derivative $f'(x)$ of a function is
obtained by multiplication of its Fourier transform by $ik$:

$$\mathcal{F}[\,f'(x)\,] = ik\,\hat f(k). \tag{7.43}$$

Similarly, the Fourier transform of the product function $x\,f(x)$ is obtained by differentiating
the Fourier transform of $f(x)$:

$$\mathcal{F}[\,x\,f(x)\,] = i\,\frac{d\hat f}{dk}. \tag{7.44}$$


The  second  statement  follows  easily  from  the  first  via  the  Symmetry  Principle  of 
Theorem  7.3.  While  the  result  is  stated  for  ordinary  functions,  as  noted  earlier,  the  Fourier 
transform  — just  like  Fourier  series  —  is  entirely  compatible  with  the  calculus  of  generalized 
functions. 
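As a quick sanity check of (7.43), the following minimal sketch approximates the Fourier integral by a midpoint rule and compares both sides for a Gaussian. The helper names (`fourier_transform`, `derivative_rule_error`), the truncation interval, and the grid size are illustrative assumptions, not part of the text.

```python
import cmath
import math

def fourier_transform(f, k, half_width=12.0, n=4000):
    """Approximate (1/sqrt(2*pi)) * integral of f(x) e^{-ikx} dx by the midpoint rule.

    The integral is truncated to [-half_width, half_width], which is only
    adequate for rapidly decaying f (an assumption of this sketch).
    """
    dx = 2.0 * half_width / n
    total = 0.0 + 0.0j
    for j in range(n):
        x = -half_width + (j + 0.5) * dx
        total += f(x) * cmath.exp(-1j * k * x)
    return total * dx / math.sqrt(2.0 * math.pi)

def gaussian(x):
    return math.exp(-x * x / 2.0)

def gaussian_derivative(x):
    # Exact derivative of exp(-x^2/2)
    return -x * math.exp(-x * x / 2.0)

def derivative_rule_error(k):
    """Return |F[f'](k) - i k F[f](k)| for the Gaussian above."""
    lhs = fourier_transform(gaussian_derivative, k)
    rhs = 1j * k * fourier_transform(gaussian, k)
    return abs(lhs - rhs)
```

Calling `derivative_rule_error(k)` for a few moderate values of k should return values near rounding error, since the midpoint rule is very accurate for smooth, rapidly decaying integrands.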

Example 7.8. The derivative of the even exponential pulse $f_e(x) = e^{-a|x|}$ is a
multiple of the odd exponential pulse $f_o(x) = (\operatorname{sign} x)\, e^{-a|x|}$:

$$f_e'(x) = -a\,(\operatorname{sign} x)\, e^{-a|x|} = -a\, f_o(x).$$

Proposition 7.7 says that their Fourier transforms are related by

$$ik\,\hat f_e(k) = i\,\sqrt{\frac{2}{\pi}}\;\frac{k\,a}{k^2 + a^2} = -a\,\hat f_o(k),$$


^ We are assuming that the integrand is sufficiently nice in order to bring the derivative under
the integral sign; see [37, 117] for a fully rigorous justification.

as previously noted in (7.21, 24). On the other hand, the odd exponential pulse has a jump
discontinuity of magnitude 2 at x = 0, and so its derivative contains a delta function:

$$f_o'(x) = -a\, e^{-a|x|} + 2\,\delta(x) = -a\, f_e(x) + 2\,\delta(x).$$

This is reflected in the relation between their Fourier transforms. If we multiply (7.24) by
$ik$, we obtain

$$ik\,\hat f_o(k) = \sqrt{\frac{2}{\pi}}\;\frac{k^2}{k^2 + a^2} = \sqrt{\frac{2}{\pi}} - \sqrt{\frac{2}{\pi}}\;\frac{a^2}{k^2 + a^2} = 2\,\hat\delta(k) - a\,\hat f_e(k).$$

Higher-order derivatives are handled by iterating the first-order formula (7.43).

Corollary 7.9. The Fourier transform of $f^{(n)}(x)$ is $(ik)^n \hat f(k)$.


This result has an important consequence: the smoothness of the function $f(x)$ is
manifested in the rate of decay of its Fourier transform $\hat f(k)$. We already noted that the
Fourier transform of a (nice) function must decay to zero at large frequencies: $\hat f(k) \to 0$
as $|k| \to \infty$. (This result can be viewed as the Fourier transform version of the
Riemann-Lebesgue Lemma 3.46.) If the nth derivative $f^{(n)}(x)$ is also a reasonable function, then its
Fourier transform $\widehat{f^{(n)}}(k) = (ik)^n \hat f(k)$ must go to zero as $|k| \to \infty$. This requires that
$\hat f(k)$ go to zero more rapidly than $|k|^{-n}$. Thus, the smoother $f(x)$, the more rapid the
decay of its Fourier transform. As a general rule of thumb, local features of $f(x)$, such as
smoothness, are manifested by global features of $\hat f(k)$, such as the rate of decay for large
$|k|$. The Symmetry Principle implies that the reverse is also true: global features of $f(x)$
correspond to local features of $\hat f(k)$. For instance, the degree of smoothness of $\hat f(k)$ governs
the rate of decay of $f(x)$ as $x \to \pm\infty$. This local-global duality is one of the major themes
of Fourier theory.
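The link between smoothness and decay can be seen in a short worked comparison, computed directly from the definition of the transform used in this chapter. The box and hat functions below are chosen purely for illustration; the symbols $b$ and $t$ are names introduced here.

```latex
% Box function: jump discontinuities at x = \pm 1.
b(x) = \begin{cases} 1, & |x| < 1, \\ 0, & |x| > 1, \end{cases}
\qquad
\hat b(k) = \frac{1}{\sqrt{2\pi}} \int_{-1}^{1} e^{-ikx}\,dx
          = \sqrt{\frac{2}{\pi}}\;\frac{\sin k}{k} = O\bigl(|k|^{-1}\bigr).

% Hat function: continuous, with a piecewise constant derivative.
t(x) = \begin{cases} 1 - |x|, & |x| < 1, \\ 0, & |x| > 1, \end{cases}
\qquad
\hat t(k) = \frac{2}{\sqrt{2\pi}} \int_{0}^{1} (1 - x)\cos kx\,dx
          = \sqrt{\frac{2}{\pi}}\;\frac{1 - \cos k}{k^{2}} = O\bigl(|k|^{-2}\bigr).
```

In this pair, gaining one degree of regularity (from a jump in the function to a jump in its derivative) improves the decay of the transform by one power of $|k|$.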


Integration 


Integration  is  the  inverse  operation  to  differentiation,  and  so  should  correspond  to  division 
by  ik  in  frequency  space.  As  with  Fourier  series,  this  is  not  completely  correct;  there  is 
an  extra  constant  involved,  which  contributes  an  additional  delta  function. 

Proposition 7.10. If $f(x)$ has Fourier transform $\hat f(k)$, then the Fourier transform
of its integral $g(x) = \int_{-\infty}^{x} f(y)\,dy$ is

$$\hat g(k) = -\,\frac{i}{k}\,\hat f(k) + \pi\,\hat f(0)\,\delta(k). \tag{7.45}$$

Proof: First notice that

$$\lim_{x \to -\infty} g(x) = 0, \qquad \lim_{x \to +\infty} g(x) = \int_{-\infty}^{\infty} f(x)\,dx = \sqrt{2\pi}\,\hat f(0).$$

Therefore, if we subtract a suitable multiple of the step function from the integral, the
resulting function

$$h(x) = g(x) - \sqrt{2\pi}\,\hat f(0)\,\sigma(x)$$

decays to 0 at both $\pm\infty$. Consulting our table of Fourier transforms, we find

$$\hat h(k) = \hat g(k) - \pi\,\hat f(0)\,\delta(k) + \frac{i}{k}\,\hat f(0). \tag{7.46}$$

On the other hand,

$$h'(x) = f(x) - \sqrt{2\pi}\,\hat f(0)\,\delta(x).$$

Since $h(x) \to 0$ as $|x| \to \infty$, we can apply our differentiation rule (7.43), and conclude
that

$$ik\,\hat h(k) = \hat f(k) - \hat f(0). \tag{7.47}$$

Combining (7.46) and (7.47) establishes the desired formula (7.45). Q.E.D.

Example 7.11. The Fourier transform of the inverse tangent function

$$f(x) = \tan^{-1} x = \int_{0}^{x} \frac{dy}{1 + y^2} = \int_{-\infty}^{x} \frac{dy}{1 + y^2} - \frac{\pi}{2}$$

can be computed by combining Proposition 7.10 with (7.28, 34):

$$\hat f(k) = -\,i\,\sqrt{\frac{\pi}{2}}\;\frac{e^{-|k|}}{k} + \frac{\pi^{3/2}}{\sqrt{2}}\,\delta(k) - \frac{\pi^{3/2}}{\sqrt{2}}\,\delta(k) = -\,i\,\sqrt{\frac{\pi}{2}}\;\frac{e^{-|k|}}{k}.$$

The singularity at k = 0 reflects the lack of decay of the inverse tangent as $|x| \to \infty$.

Exercises 


7.2.1. Determine the Fourier transform of the following functions:

(a) $e^{-x^2/2}$, (b) $x\,e^{-x^2/2}$, (c) $x^2 e^{-x^2/2}$, (d) $x$, (e) $x\,e^{-2|x|}$, (f) $x \tan^{-1} x$.

7.2.2. Find the Fourier transform of (a) the error function $\operatorname{erf} x = \dfrac{2}{\sqrt{\pi}} \displaystyle\int_0^x e^{-z^2}\,dz$;

(b) the complementary error function $\operatorname{erfc} x = \dfrac{2}{\sqrt{\pi}} \displaystyle\int_x^{\infty} e^{-z^2}\,dz$.


7.2.3.  Find  the  inverse  Fourier  transform  of  the  following  functions: 


(a)  k,  (b)  fee 


(c) 


/c 


(1  +  k 2)2  ’ 


(^) 


k 


(e) 


1 


k2  —  k 


7.2.4. Is the usual formula $\sigma'(x) = \delta(x)$ relating the step and delta functions compatible with
their Fourier transforms? Justify your answer.

7.2.5. Find the Fourier transform of the derivative $\delta'(x)$ of the delta function in three ways:

(a) First, directly from the definition of $\delta'(x)$; (b) second, using the formula for the Fourier
transform of the derivative of a function; (c) third, as a limit of the Fourier transforms of
the derivatives of the functions in Exercise 7.1.8. (d) Are your answers all the same? If
not, can you explain any discrepancies?

7.2.6. Show that one can obtain the Fourier transform of the Gaussian function $f(x) = e^{-x^2/2}$
by the following trick. First, prove that $\hat f'(k) = -k\,\hat f(k)$. Use this to deduce that
$\hat f(k) = c\,e^{-k^2/2}$ for some constant c. Finally, use the Symmetry Principle to determine c.

7.2.7.  If  f(x)  has  Fourier  transform  f(k),  which  function  has  Fourier  transform 


k 


?

^ 7.2.8. If $f(x)$ has Fourier transform $\hat f(k)$, what is the Fourier transform of


f(x) 


7.2.9.  Use  Exercise  7.2.8  to  find  the  Fourier  transform  of 

(a)  1/x,  (b)  (c)  x~1e~x  ,  (d)  (x3+4x)_1. 


7.2.10. Directly justify formula (7.43) by integrating the relevant Fourier transform integral by
parts. What do you need to assume about the behavior of $f(x)$ for large $|x|$?

7.2.11. Given the Fourier transform $\hat f(k)$ of $f(x)$, find the Fourier transform of its integral
$g(x) = \int_a^x f(y)\,dy$ starting at the point $a \in \mathbb{R}$.

§  7.2.12.  (a)  Explain  why  the  Fourier  transform  of  a  27r-periodic  function  f(x)  is  a  linear  combi- 

^  o° 

nation  of  delta  functions,  f(k)  =  E  cnS(k  —  n),  where  cn  are  the  (complex)  Fourier 

n  =  —  oo 

series  coefficients  (3.65)  of  f(x)  on  [  —  tt,  tt  ] . 

(b)  Find  the  Fourier  transform  of  the  following  periodic  functions: 

(i)  sin2x,  (ii)  cos  x,  (Hi)  the  27r-periodic  extension  of  f(x)  =  x, 

(iv)  the  sawtooth  function  h(x)  =  x  mod  1,  i.e.,  the  fractional  part  of  x. 


7.2.13.  Determine  the  Fourier  transforms  of  (a)  cosx  —  1,  (b) 
Hint :  Use  Exercises  7.2.8  and  7.2.12. 


cos  x  —  1 
x 


cosx  —  1 

X 


^  7.2.14.  Write  down  the  formulas  for  differentiation  and  integration  for  the  alternative  Fourier 
transforms  of  Exercises  7.1.17  and  7.1.18. 

7.2.15. (a) What is the two-dimensional Fourier transform, (7.41), of the gradient $\nabla f(x,y)$ of a
function of two variables?

(b) Use your formula to find the Fourier transform of the gradient of $f(x,y) = e^{-x^2 - y^2}$.


7.3  Green’s  Functions  and  Convolution 

The  fact  that  the  Fourier  transform  converts  differentiation  in  the  physical  domain  into 
multiplication  in  the  frequency  domain  is  one  of  its  most  compelling  features.  A  particularly 
important  consequence  is  that  it  effectively  transforms  differential  equations  into  algebraic 
equations, and thereby facilitates their solution by elementary algebra. One begins by applying
the Fourier transform to both sides of the differential equation under consideration.
Solving  the  resulting  algebraic  equation  will  produce  a  formula  for  the  Fourier  transform  of 
the  desired  solution,  which  can  then  be  immediately  reconstructed  via  the  inverse  Fourier 
transform.  In  the  following  chapter,  we  will  use  these  techniques  to  solve  partial  differential 
equations. 


Solution  of  Boundary  Value  Problems 


The Fourier transform is particularly well adapted to boundary value problems on the
entire real line. In place of the boundary conditions used on finite intervals, we look for
solutions that decay to zero sufficiently rapidly as $|x| \to \infty$, in order that their Fourier
transform be well defined (in the context of ordinary functions). In quantum mechanics,
[66, 72], these solutions are known as the bound states, and they correspond to subatomic
particles that are trapped or localized in a region of space. For example, the electrons in
an atom are bound states localized by the electrostatic attraction of the nucleus.

As a specific example, consider the boundary value problem

$$-\,\frac{d^2 u}{dx^2} + \omega^2 u = h(x), \qquad -\infty < x < \infty, \tag{7.48}$$

where $\omega > 0$ is a positive constant. The boundary conditions require that the solution
decay: $u(x) \to 0$, as $|x| \to \infty$. We will solve this problem by applying the Fourier
transform to both sides of the differential equation. Taking Corollary 7.9 into account, the
result is the linear algebraic equation

$$k^2\,\hat u(k) + \omega^2\,\hat u(k) = \hat h(k)$$


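relating the Fourier transforms of u and h. Unlike the differential equation, the transformed
equation can be immediately solved for

$$\hat u(k) = \frac{\hat h(k)}{k^2 + \omega^2}. \tag{7.49}$$

Therefore, we can reconstruct the solution by applying the inverse Fourier transform
formula (7.9):

$$u(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{\hat h(k)\, e^{ikx}}{k^2 + \omega^2}\, dk. \tag{7.50}$$

For example, if the forcing function is an even exponential pulse,

$$h(x) = e^{-|x|} \qquad \text{with} \qquad \hat h(k) = \sqrt{\frac{2}{\pi}}\;\frac{1}{k^2 + 1},$$

then (7.50) writes the solution as a Fourier integral:

$$u(x) = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{e^{ikx}}{(k^2 + \omega^2)(k^2 + 1)}\, dk = \frac{1}{\pi} \int_{-\infty}^{\infty} \frac{\cos kx}{(k^2 + \omega^2)(k^2 + 1)}\, dk,$$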

where we note that the imaginary part of the complex integral vanishes because the
integrand is an odd function. (Indeed, if the forcing function is real, the solution must also be
real.) The Fourier integral can be explicitly evaluated using partial fractions to rewrite

$$\hat u(k) = \sqrt{\frac{2}{\pi}}\;\frac{1}{(k^2 + \omega^2)(k^2 + 1)} = \sqrt{\frac{2}{\pi}}\;\frac{1}{\omega^2 - 1} \left( \frac{1}{k^2 + 1} - \frac{1}{k^2 + \omega^2} \right), \qquad \omega^2 \ne 1.$$

Thus, according to our table of Fourier transforms, the solution to this boundary value
problem is

$$u(x) = \frac{e^{-|x|} - \dfrac{1}{\omega}\, e^{-\omega |x|}}{\omega^2 - 1} \qquad \text{when} \qquad \omega^2 \ne 1. \tag{7.51}$$

The reader may wish to verify that this function is indeed a solution, meaning that it is
twice continuously differentiable (which is not so immediately apparent from the formula),
decays to 0 as $|x| \to \infty$, and satisfies the differential equation everywhere. The "resonant"
case $\omega^2 = 1$ is left to Exercise 7.3.6.
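The verification suggested above can also be carried out numerically. The following minimal sketch evaluates the closed-form solution (7.51) and checks the residual of the equation $-u'' + \omega^2 u = e^{-|x|}$ using a central difference for $u''$. The function names and the step size are illustrative assumptions; note that at x = 0 the third derivative of u jumps, so the difference quotient is only first-order accurate there.

```python
import math

def forcing(x):
    """Even exponential pulse h(x) = exp(-|x|)."""
    return math.exp(-abs(x))

def solution(x, omega):
    """Closed-form solution (7.51); requires omega**2 != 1."""
    return (math.exp(-abs(x)) - math.exp(-omega * abs(x)) / omega) / (omega ** 2 - 1.0)

def residual(x, omega, step=1e-3):
    """Return -u'' + omega^2 u - h at x, with u'' from a central difference."""
    second = (solution(x + step, omega)
              - 2.0 * solution(x, omega)
              + solution(x - step, omega)) / step ** 2
    return -second + omega ** 2 * solution(x, omega) - forcing(x)
```

For instance, evaluating `residual(x, 2.0)` at a handful of points, including x = 0, should give values that shrink toward zero as `step` is reduced.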


Remark :  The  method  of  partial  fractions  that  you  may  have  learned  in  first-year 
calculus  is  often  an  effective  tool  for  evaluating  (inverse)  Fourier  transforms  of  such  rational 
functions.

A particularly important case is that in which the forcing function

$$h(x) = \delta_\xi(x) = \delta(x - \xi)$$

represents a unit impulse concentrated at $x = \xi$. The resulting solution is the Green's
function $G(x;\xi)$ for the boundary value problem. According to (7.49), its Fourier transform
with respect to x is

$$\hat G(k;\xi) = \frac{1}{\sqrt{2\pi}}\;\frac{e^{-ik\xi}}{k^2 + \omega^2},$$

which is the product of an exponential factor $e^{-ik\xi}/\sqrt{2\pi}$, representing the Fourier transform of
$\delta_\xi(x)$, times a multiple of the Fourier transform of the even exponential pulse $e^{-\omega|x|}$. We
apply the Shift Theorem 7.4, and conclude that the Green's function for this boundary
value problem is an exponential pulse centered at $\xi$, namely

$$G(x;\xi) = \frac{1}{2\omega}\, e^{-\omega|x - \xi|} = g(x - \xi), \qquad \text{where} \qquad g(x) = G(x;0) = \frac{1}{2\omega}\, e^{-\omega|x|}. \tag{7.52}$$

Observe that, as with other self-adjoint boundary value problems, the Green's function
is symmetric under interchange of x and $\xi$, so $G(x;\xi) = G(\xi;x)$. As a function of x, it
satisfies the homogeneous differential equation $-u'' + \omega^2 u = 0$, except at the point $x = \xi$,
where its derivative has a jump discontinuity of unit magnitude. It also decays as $|x| \to \infty$,
as required by the boundary conditions. The fact that $G(x;\xi) = g(x - \xi)$ depends only
on the difference $x - \xi$ is a consequence of the translation invariance of the boundary
value problem. The superposition principle based on the Green's function tells us that the
solution to the inhomogeneous boundary value problem (7.48) under a general forcing can
be represented in the integral form

$$u(x) = \int_{-\infty}^{\infty} G(x;\xi)\,h(\xi)\,d\xi = \int_{-\infty}^{\infty} g(x - \xi)\,h(\xi)\,d\xi = \frac{1}{2\omega} \int_{-\infty}^{\infty} e^{-\omega|x - \xi|}\,h(\xi)\,d\xi. \tag{7.53}$$

The reader may enjoy recovering the particular exponential solution (7.51) from this
integral formula.
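The recovery of (7.51) from the superposition formula (7.53) can be checked numerically. The sketch below approximates the integral by a midpoint rule on a truncated interval and compares it with the closed form. The function names, the truncation width, and the grid size are assumptions made for illustration only.

```python
import math

def green(x, xi, omega):
    """Green's function (7.52): exp(-omega |x - xi|) / (2 omega)."""
    return math.exp(-omega * abs(x - xi)) / (2.0 * omega)

def forcing_pulse(xi):
    """Even exponential pulse h(xi) = exp(-|xi|)."""
    return math.exp(-abs(xi))

def solve_by_superposition(h, x, omega, half_width=30.0, n=20000):
    """Midpoint-rule approximation of the integral of G(x; xi) h(xi) d(xi)."""
    d_xi = 2.0 * half_width / n
    total = 0.0
    for j in range(n):
        xi = -half_width + (j + 0.5) * d_xi
        total += green(x, xi, omega) * h(xi)
    return total * d_xi

def closed_form(x, omega):
    """Solution (7.51) for the exponential pulse; requires omega**2 != 1."""
    return (math.exp(-abs(x)) - math.exp(-omega * abs(x)) / omega) / (omega ** 2 - 1.0)
```

With `omega = 2.0`, the values of `solve_by_superposition(forcing_pulse, x, 2.0)` and `closed_form(x, 2.0)` should agree to several digits; the remaining gap comes from the quadrature, since the integrand has kinks at $\xi = 0$ and $\xi = x$.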


Exercises 


7.3.1.  Use  partial  fractions  to  compute  the  inverse  Fourier  transform  of  the  following  rational 
functions.  Hint :  First  solve  Exercise  7.1.3. 

„  i  k 


(a) 


1 


(b) 


(c) 


1 


(d) 


sin  2  k 


/c4  —  1  ’  k2  -\- 2k  —  3 

1 


k2  —  5  /c  —  6  k2  —  1  ’ 

7.3.2.  Find  the  inverse  Fourier  transform  of  the  function  0 

k2  +  2/c  +  5 

(a)  using  partial  fractions;  (b)  by  completing  the  square.  Are  your  answers  the  same? 

7.3.3.  Use  partial  fractions  to  compute  the  Fourier  transform  of  the  following  functions: 


(a) 


1 


x2  —  x  —  2 


(b) 


1 


X6  +  X 


(c) 


cosx 
x2  —  9 


d2u 


7.3.4.  Find  a  solution  to  the  differential  equation  —  +4 u  —  8{x)  by  using  the  Fourier 

transform.  x

7.3.5. Use the Fourier transform to solve the boundary value problem
$-u'' + u = \delta'(x - 1)$ for $-\infty < x < \infty$, with $u(x) \to 0$ as $x \to \pm\infty$.

0 7.3.6. (a) Use the Fourier transform to solve (7.48) with $h(x) = e^{-|x|}$ when $\omega = 1$.

(b) Verify that your solution can be obtained as a limit of (7.51) as $\omega \to 1$.

7.3.7. Use the Fourier transform to find a bounded solution to the differential equation
$u'''' + u = e^{-2|x|}$.

7.3.8. Use the Fourier transform to find an integral formula for a bounded solution to the
Airy differential equation $\dfrac{d^2 u}{dx^2} = x\,u$.


0  7.3.9.  Prove  that  (7.51)  is  a  twice  continuously  differentiable  function  of  x  and  satisfies  the 
differential  equation  (7.48). 


Convolution 


In  our  solution  to  the  boundary  value  problem  (7.48),  we  ended  up  deriving  a  formula  for 
its  Fourier  transform  (7.49)  as  the  product  of  two  known  Fourier  transforms.  The  final 
Green’s  function  formula  (7.53),  obtained  by  applying  the  inverse  Fourier  transform,  is 
indicative  of  a  general  property,  in  that  it  is  given  by  a  convolution  product. 

Definition 7.12. The convolution of scalar functions $f(x)$ and $g(x)$ is the scalar
function $h = f * g$ defined by the formula

$$h(x) = f * g(x) = \int_{-\infty}^{\infty} f(x - \xi)\,g(\xi)\,d\xi. \tag{7.54}$$


We list the basic properties of the convolution product, leaving their verification as
exercises for the reader. All of these assume that the implied convolution integrals converge.

(a) Symmetry: $f * g = g * f$,

(b) Bilinearity: $f * (a\,g + b\,h) = a\,(f * g) + b\,(f * h)$,
    $(a\,f + b\,g) * h = a\,(f * h) + b\,(g * h)$, for $a, b \in \mathbb{C}$,

(c) Associativity: $f * (g * h) = (f * g) * h$,

(d) Zero function: $f * 0 = 0$,

(e) Delta function: $f * \delta = f$.


One tricky feature is that the constant function 1 is not a unit for the convolution
product; indeed,

$$f * 1 = \int_{-\infty}^{\infty} f(\xi)\,d\xi$$

is a constant function, namely the total integral of f, and not the original function $f(x)$. In
fact, according to the final property, the delta function plays the role of the "convolution
unit":

$$f * \delta(x) = \int_{-\infty}^{\infty} f(x - \xi)\,\delta(\xi)\,d\xi = f(x).$$

In particular, our solution (7.52) has the form of a convolution product between an
even exponential pulse $g(x) = (2\omega)^{-1} e^{-\omega|x|}$ and the forcing function:

$$u(x) = g * h(x).$$

On the other hand, its Fourier transform (7.49) is, up to a factor, the ordinary multiplicative
product

$$\hat u(k) = \sqrt{2\pi}\;\hat g(k)\,\hat h(k)$$

of the Fourier transforms of g and h. In fact, this is a general property of the Fourier
transform: convolution in the physical domain corresponds to multiplication in the frequency
domain, and conversely.

Theorem 7.13. The Fourier transform of the convolution $h(x) = f * g(x)$ of two
functions is a multiple of the product of their Fourier transforms:

$$\hat h(k) = \sqrt{2\pi}\;\hat f(k)\,\hat g(k). \tag{7.55}$$

Conversely, the Fourier transform of their product $h(x) = f(x)\,g(x)$ is, up to a multiple,
the convolution of their Fourier transforms:

$$\hat h(k) = \frac{1}{\sqrt{2\pi}}\;\hat f * \hat g(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat f(k - \kappa)\,\hat g(\kappa)\,d\kappa. \tag{7.56}$$


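Proof: Combining the definition of the Fourier transform with the convolution
formula (7.54), we obtain

$$\hat h(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} h(x)\,e^{-ikx}\,dx = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(x - \xi)\,g(\xi)\,e^{-ikx}\,dx\,d\xi.$$

Applying the change of variables $\eta = x - \xi$ in the inner integral produces

$$\hat h(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} f(\eta)\,g(\xi)\,e^{-ik(\xi + \eta)}\,d\xi\,d\eta$$

$$\phantom{\hat h(k)} = \sqrt{2\pi} \left( \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(\eta)\,e^{-ik\eta}\,d\eta \right) \left( \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(\xi)\,e^{-ik\xi}\,d\xi \right) = \sqrt{2\pi}\;\hat f(k)\,\hat g(k),$$

proving (7.55). The second formula can be proved in a similar fashion, or by simply noting
that it follows directly from the Symmetry Principle of Theorem 7.3. Q.E.D.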
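Formula (7.55) can be tested numerically on a pair of Gaussians. In the minimal sketch below, both the convolution and the Fourier integrals are approximated by the same midpoint rule; the choice of test functions, the helper names, and the grid parameters are assumptions made for illustration.

```python
import cmath
import math

def midpoint_grid(half_width, n):
    """Return midpoint-rule nodes on [-half_width, half_width] and the cell width."""
    step = 2.0 * half_width / n
    return [-half_width + (j + 0.5) * step for j in range(n)], step

def f(x):
    return math.exp(-x * x / 2.0)

def g(x):
    return math.exp(-x * x)

def convolution(x, nodes, step):
    """Approximate (f * g)(x), the integral of f(x - xi) g(xi) d(xi)."""
    return sum(f(x - xi) * g(xi) for xi in nodes) * step

def transform(func, k, nodes, step):
    """Approximate (1/sqrt(2*pi)) * integral of func(x) e^{-ikx} dx."""
    total = sum(func(x) * cmath.exp(-1j * k * x) for x in nodes)
    return total * step / math.sqrt(2.0 * math.pi)

def convolution_theorem_sides(k, half_width=10.0, n=400):
    """Return (transform of f*g, sqrt(2*pi) * f_hat * g_hat) at frequency k."""
    nodes, step = midpoint_grid(half_width, n)
    h_hat = transform(lambda x: convolution(x, nodes, step), k, nodes, step)
    product = (math.sqrt(2.0 * math.pi)
               * transform(f, k, nodes, step)
               * transform(g, k, nodes, step))
    return h_hat, product
```

For these two Gaussians the convolution can also be done by hand, giving $\hat h(k) = \sqrt{\pi}\,e^{-3k^2/4}$, so both returned values can be compared against that expression as well as against each other.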

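Example 7.14. We already know, (7.29), that the Fourier transform of

$$f(x) = \frac{\sin x}{x}$$

is the box function

$$\hat f(k) = \sqrt{\frac{\pi}{2}}\,\bigl[\,\sigma(k + 1) - \sigma(k - 1)\,\bigr] = \begin{cases} \sqrt{\dfrac{\pi}{2}}, & -1 < k < 1, \\[1ex] 0, & |k| > 1. \end{cases}$$

We also know that the Fourier transform of

$$g(x) = \frac{1}{x} \qquad \text{is} \qquad \hat g(k) = -\,i\,\sqrt{\frac{\pi}{2}}\;\operatorname{sign} k.$$

Therefore, the Fourier transform of their product

$$h(x) = f(x)\,g(x) = \frac{\sin x}{x^2}$$

can be obtained by convolution:

$$\hat h(k) = \frac{1}{\sqrt{2\pi}}\;\hat f * \hat g(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat f(\kappa)\,\hat g(k - \kappa)\,d\kappa = \begin{cases} i\,\sqrt{\dfrac{\pi}{2}}, & k < -1, \\[1ex] -\,i\,\sqrt{\dfrac{\pi}{2}}\;k, & -1 < k < 1, \\[1ex] -\,i\,\sqrt{\dfrac{\pi}{2}}, & k > 1. \end{cases}$$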

Exercises 


7.3.10. (a) Find the Fourier transform of the convolution $h(x) = f_e * g(x)$ of an even exponential
pulse $f_e(x) = e^{-|x|}$ and a Gaussian $g(x) = e^{-x^2}$. (b) What is $h(x)$?

7.3.11. What is the convolution of a Gaussian kernel $e^{-x^2}$ with itself? Hint: Use the Fourier
transform.

7.3.12. Find the function whose Fourier transform is $\hat f(k) = (k^2 + 1)^{-2}$.

C 7.3.13. (a) Write down the Fourier transform of the box function $f(x)$ =

(b) Graph the hat function $h(x) = f * f(x)$ and find its Fourier transform.

(c) Determine the cubic B spline $s(x) = h * h(x)$ and its Fourier transform.


f 

X 

1  0, 

X 

<  h 
>  h- 


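7.3.14. Let

$$f(x) = \begin{cases} \sin x, & 0 < x < \pi, \\ 0, & \text{otherwise}, \end{cases} \qquad g(x) = \begin{cases} \cos x, & 0 < x < \pi, \\ 0, & \text{otherwise}. \end{cases}$$

(a) Find the Fourier transforms of $f(x)$ and $g(x)$; (b) compute the convolution
$h(x) = f * g(x)$; (c) find its Fourier transform $\hat h(k)$.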

7.3.15.  Use  convolution  to  find  an  integral  formula  for  the  function  whose  Fourier  transform  is 


(a) 


(b) 


sin  k 


(c) 


sin2  k 


(d) 


sign  k 
1  +  i  k 


k2  +  1  k(k2  +  1)  k2  5 

If  possible,  evaluate  the  resulting  convolution  integral. 

7.3.16. Let $f(x)$ be a smooth function. (a) Find its convolution $\delta' * f$ with the derivative of the
delta function. (b) More generally, find $\delta^{(n)} * f$.

7.3.17. According to Proposition 7.7, the Fourier transform of the derivative $f'(x)$ is
obtained by multiplying $\hat f(k)$ by $ik$. Can you reconcile this result with the Convolution
Theorem 7.13?

0 7.3.18. The Hilbert transform of a function $f(x)$ is defined as the integral

$$h(x) = \frac{1}{\pi}\; -\!\!\!\!\!\!\int_{-\infty}^{\infty} \frac{f(\xi)}{\xi - x}\,d\xi. \tag{7.57}$$

Find a formula for its Fourier transform $\hat h(k)$ in terms of $\hat f(k)$. Remark: The bar on the
integral indicates the principal value integral, [21, which is

$$\lim_{\delta \to 0^+} \left( \int_{-\infty}^{x - \delta} + \int_{x + \delta}^{\infty} \right) \frac{f(\xi)}{\xi - x}\,d\xi,$$

and is employed to avoid the integral diverging at the singular point $x = \xi$.




/OO  _  I  x_t  I 

e  1  ^  1  u(£)  d £  =  f(x). 

Then  verify  your  solution  when  f(x)  =  e~  *  1  ^  *. 

7.3.20.  Suppose  that  f(x)  and  g(x)  are  identically  0  for  all  x  <  0.  Prove  that  their  convolution 

{rX 

Jo  ^ X  0  9(0  x  >  0, 

0,  x  <  0. 

7.3.21. Given that the support of $f(x)$ is contained in the interval $[a, b]$ and the support of
$g(x)$ is contained in $[c, d]$, what can you say about the support of their convolution
$h(x) = f * g(x)$?

7.3.22.  Prove  the  convolution  properties  (a-e). 

0  7.3.23.  In  this  exercise,  we  explain  how  convolution  can  be  used  to  smooth  out  rough  data.  Let 
g£(x)  =  2  ^ •  (a)  ^  f(x )  any  (reasonable)  function,  show  that  f£(x)  =  g£  *  f(x) 

for  e  /  0  is  a  C°°  function,  (b)  Show  that  lim  f£(x)  =  f(x). 


7.3.24.  Explain  why  the  Shift  Theorem  7.4  is  a  special  case  of  the  Convolution  Theorem  7.13. 

0 7.3.25. Suppose $f(x)$ and $g(x)$ are $2\pi$-periodic and have respective complex Fourier coefficients
$c_k$ and $d_k$. Prove that the complex Fourier coefficients $e_k$ of the product function $f(x)\,g(x)$
are given by the convolution summation $e_k = \sum_{j=-\infty}^{\infty} c_j\,d_{k-j}$. Hint: Substitute the formulas
for the complex Fourier coefficients into the summation, making sure to use two different
integration variables, and then use (6.37).


7.4  The  Fourier  Transform  on  Hilbert  Space 


While  we  do  not  possess  all  the  analytic  tools  to  embark  on  a  fully  rigorous  treatment  of  the 
mathematical  theory  underlying  the  Fourier  transform,  it  is  worth  outlining  a  few  of  the 
more  important  features.  We  have  already  noted  that  the  Fourier  transform,  when  defined, 
is  a  linear  operator,  taking  functions  f(x)  on  physical  space  to  functions  f(k)  on  frequency 
space.  A  critical  question  is  the  following:  to  precisely  which  function  space  should  the 
theory be applied? Not every function admits a Fourier transform in the classical sense:
the Fourier integral (7.6) is required to converge, and this places restrictions on the
function and its asymptotics at large distances.


It turns out the proper setting for the rigorous theory is the Hilbert space of complex-valued
square-integrable functions, the same infinite-dimensional vector space that lies
at the heart of modern quantum mechanics. In Section 3.5, we already introduced the
Hilbert space $L^2[a,b]$ on a finite interval; here we adapt Definition 3.34 to the entire real
line. Thus, the Hilbert space $L^2 = L^2(\mathbb{R})$ is the infinite-dimensional vector space consisting
of all complex-valued functions $f(x)$ that are defined for all $x \in \mathbb{R}$ and have finite $L^2$ norm:

$$\| f \|^2 = \int_{-\infty}^{\infty} | f(x) |^2\,dx < \infty. \tag{7.58}$$


We leave aside the more advanced issues involving generalized functions.

For example, any piecewise continuous function that satisfies the decay criterion

$$| f(x) | \le \frac{M}{| x |^{1/2 + \delta}} \qquad \text{for all sufficiently large} \quad | x | \gg 0, \tag{7.59}$$

for some $M > 0$ and $\delta > 0$, belongs to $L^2$. However, as in Section 3.5, Hilbert space contains
many more functions, and the precise definitions and identification of its elements is quite
subtle. On the other hand, most nondecaying functions do not belong to $L^2$, including the
constant function $f(x) = 1$ as well as all oscillatory complex exponentials, $e^{ikx}$ for $k \in \mathbb{R}$.

The Hermitian inner product on the complex Hilbert space $L^2$ is prescribed in the
usual manner,

$$\langle f, g \rangle = \int_{-\infty}^{\infty} f(x)\,\overline{g(x)}\,dx, \tag{7.60}$$

so that $\| f \|^2 = \langle f, f \rangle$. The Cauchy-Schwarz inequality

$$| \langle f, g \rangle | \le \| f \|\,\| g \| \tag{7.61}$$

ensures that the inner product integral is finite whenever $f, g \in L^2$. Observe that the
Fourier transform (7.6) can be regarded as a multiple of the inner product of the function
$f(x)$ with the complex exponential functions:

$$\hat f(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\,e^{-ikx}\,dx = \frac{1}{\sqrt{2\pi}}\,\langle\, f(x), e^{ikx} \,\rangle. \tag{7.62}$$

However,  when  interpreting  this  formula,  one  must  bear  in  mind  that  the  exponentials  are 
not  themselves  elements  of  L2. 

Let  us  state  the  fundamental  result  governing  the  effect  of  the  Fourier  transform 
on  functions  in  Hilbert  space.  It  can  be  regarded  as  a  direct  analogue  of  the  Pointwise 
Convergence  Theorem  3.8  for  Fourier  series. 

Theorem 7.15. If $f(x) \in L^2$ is square-integrable, then its Fourier transform $\hat f(k) \in L^2$
is a well-defined, square-integrable function of the frequency variable k. If $f(x)$ is
continuously differentiable at a point x, then the inverse Fourier transform integral (7.9)
equals its value $f(x)$. More generally, if the left- and right-hand limits $f(x^-)$, $f(x^+)$,
$f'(x^-)$, $f'(x^+)$ exist, then the inverse Fourier transform integral converges to the average
value $\tfrac{1}{2}\,[\,f(x^-) + f(x^+)\,]$.

Thus, the Fourier transform $\hat f = \mathcal{F}[f]$ defines a linear transformation from $L^2$ functions
of x to $L^2$ functions of k. In fact, the Fourier transform preserves inner products.
This important result is known as Parseval's formula, whose Fourier series counterpart
appeared in (3.122).

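Theorem 7.16. If $\hat f(k) = \mathcal{F}[f(x)]$ and $\hat g(k) = \mathcal{F}[g(x)]$, then $\langle f , g \rangle = \langle \hat f , \hat g \rangle$, i.e.,

$$ \int_{-\infty}^{\infty} f(x)\, \overline{g(x)}\, dx = \int_{-\infty}^{\infty} \hat f(k)\, \overline{\hat g(k)}\, dk. \tag{7.63} $$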


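Proof: Let us sketch a formal proof that serves to motivate why this result is valid.
We use the definition (7.6) of the Fourier transform to evaluate

$$ \begin{aligned}
\int_{-\infty}^{\infty} \hat f(k)\, \overline{\hat g(k)}\, dk
&= \int_{-\infty}^{\infty} \left( \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-i k x}\, dx \right) \left( \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \overline{g(y)}\, e^{+i k y}\, dy \right) dk \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x)\, \overline{g(y)} \left( \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i k (x - y)}\, dk \right) dx\, dy.
\end{aligned} $$

Now according to (7.38), the inner $k$ integral can be replaced by the delta function $\delta(x - y)$,
and hence

$$ \int_{-\infty}^{\infty} \hat f(k)\, \overline{\hat g(k)}\, dk = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x)\, \overline{g(y)}\, \delta(x - y)\, dx\, dy = \int_{-\infty}^{\infty} f(x)\, \overline{g(x)}\, dx. $$

This completes our "proof"; see [37, 68, 117] for a rigorous version. Q.E.D.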


In particular, orthogonal functions, satisfying $\langle f , g \rangle = 0$, will have orthogonal Fourier
transforms, $\langle \hat f , \hat g \rangle = 0$. Choosing $f = g$ in Parseval's formula (7.63) produces Plancherel's
formula

$$ \| f \|^2 = \| \hat f \|^2, \quad \text{or, explicitly,} \quad \int_{-\infty}^{\infty} | f(x) |^2\, dx = \int_{-\infty}^{\infty} | \hat f(k) |^2\, dk. \tag{7.64} $$

Thus, the Fourier transform $\mathcal{F} \colon L^2 \to L^2$ defines a norm-preserving, or unitary, linear
transformation on Hilbert space, mapping $L^2$ functions of the physical variable $x$ to $L^2$
functions of the frequency variable $k$.
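Plancherel's formula (7.64) is easy to test numerically. The following is a minimal sketch rather than a definitive implementation: the test function $f(x) = e^{-|x|}$, its transform $\sqrt{2/\pi}\,/(1 + k^2)$ under the convention (7.6), the truncated integration ranges, and the helper name `trapezoid` are all illustrative assumptions.

```python
import numpy as np

def trapezoid(y, h):
    """Composite trapezoid rule for samples y on a uniform grid of spacing h."""
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

# Illustrative test function (an assumption, not taken from the text):
# f(x) = exp(-|x|), whose transform under (7.6) is sqrt(2/pi) / (1 + k^2).
x = np.linspace(-40.0, 40.0, 80001)
k = np.linspace(-200.0, 200.0, 400001)
f = np.exp(-np.abs(x))
f_hat = np.sqrt(2.0 / np.pi) / (1.0 + k**2)

norm_x = trapezoid(np.abs(f)**2, x[1] - x[0])      # integral of |f(x)|^2 dx
norm_k = trapezoid(np.abs(f_hat)**2, k[1] - k[0])  # integral of |f_hat(k)|^2 dk
print(norm_x, norm_k)  # the two values should agree up to quadrature error
```

Both integrals are truncated to finite intervals, so the agreement is only approximate; the slowly decaying tail of the transform is the reason for the wider range in $k$.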


Quantum  Mechanics  and  the  Uncertainty  Principle 


In its popularized form, the Heisenberg Uncertainty Principle is a by now familiar philosophical
concept. First formulated in the 1920s by the German physicist Werner Heisenberg,
one of the founders of modern quantum mechanics, it states that, in a physical
system,  certain  quantities  cannot  be  simultaneously  measured  with  complete  accuracy. 
For  instance,  the  more  precisely  one  measures  the  position  of  a  particle,  the  less  accuracy 
there  will  be  in  the  measurement  of  its  momentum;  conversely,  the  greater  the  accuracy 
in  the  momentum,  the  less  certainty  in  its  position.  A  similar  uncertainty  couples  energy 
and  time.  Experimental  verification  of  the  uncertainty  principle  can  be  found  even  in  fairly 
simple  situations.  Consider  a  light  beam  passing  through  a  small  hole.  The  position  of  the 
photons  is  constrained  by  the  hole;  the  effect  of  their  momenta  is  observed  in  the  pattern 
of light diffused on a screen placed beyond the hole. The smaller the hole, the more constrained
the photon's position as it passes through, hence, according to the Uncertainty
Principle,  the  less  certainty  there  is  in  the  observed  momentum,  and,  consequently,  the 
wider  and  more  diffuse  the  resulting  image  on  the  screen. 

This  is  not  the  place  to  discuss  the  philosophical  and  experimental  consequences  of 
Heisenberg’s  Principle.  What  we  will  show  is  that  the  Uncertainty  Principle  is,  in  fact,  a 
mathematical property of the Fourier transform! In quantum theory, the quantities in each
pair, e.g., position and momentum, are interrelated by the Fourier transform. Indeed,
Proposition  7.7  says  that  the  Fourier  transform  of  the  differentiation  operator  representing 
momentum  is  a  multiplication  operator  representing  position  and  vice  versa.  This  Fourier- 
transform-based  duality  between  position  and  momentum,  that  is,  between  multiplication 
and  differentiation,  lies  at  the  heart  of  the  Uncertainty  Principle. 

In quantum mechanics, the wave functions of a quantum system are characterized as
the elements of unit norm, $\| \varphi \| = 1$, belonging to the underlying state space, which, in
a one-dimensional model of a single particle, is the Hilbert space $L^2 = L^2(\mathbb{R})$ consisting
of square-integrable complex-valued functions of $x$. As we already noted in Section 3.5,
the squared modulus of the wave function, $| \varphi(x) |^2$, represents the probability density of
the particle being found at position $x$. Consequently, the mean or expected value of any
function $f(x)$ of the position variable is given by its integral against the system's probability
density and denoted by

$$ \langle f(x) \rangle = \int_{-\infty}^{\infty} f(x)\, | \varphi(x) |^2\, dx. \tag{7.65} $$


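In particular,

$$ \langle x \rangle = \int_{-\infty}^{\infty} x\, | \varphi(x) |^2\, dx \tag{7.66} $$

is the expected measured position of the particle, while $\Delta x$, defined by

$$ (\Delta x)^2 = \bigl\langle ( x - \langle x \rangle )^2 \bigr\rangle = \langle x^2 \rangle - \langle x \rangle^2, \tag{7.67} $$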


which is the probability density's variance, is the statistical deviation of the particle's
measured position from the mean. We note that the next-to-last term equals

$$ \langle x^2 \rangle = \int_{-\infty}^{\infty} x^2\, | \varphi(x) |^2\, dx = \| x\, \varphi(x) \|^2. \tag{7.68} $$


On the other hand, the momentum variable $p$ is related to the Fourier transform
frequency via the de Broglie relation $p = \hbar k$, where

$$ \hbar = \frac{h}{2\pi} \approx 1.055 \times 10^{-34} \ \text{joule seconds} \tag{7.69} $$


is Planck's constant, whose value governs the quantization of physical quantities. Therefore,
the mean, or expected value, of any function of momentum $g(p)$ is given by its integral
against the squared modulus of the Fourier transformed wave function:

$$ \langle g(p) \rangle = \int_{-\infty}^{\infty} g(\hbar k)\, | \hat\varphi(k) |^2\, dk. \tag{7.70} $$


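In particular, the mean of the momentum measurements of the particle is

$$ \langle p \rangle = \hbar \int_{-\infty}^{\infty} k\, | \hat\varphi(k) |^2\, dk = -\, i\, \hbar \int_{-\infty}^{\infty} \varphi'(x)\, \overline{\varphi(x)}\, dx = -\, i\, \hbar\, \langle \varphi' , \varphi \rangle, \tag{7.71} $$

where we used Parseval's formula (7.63) to convert to an integral over position, and (7.43)
to infer that $k\, \hat\varphi(k)$ is the Fourier transform of $-\, i\, \varphi'(x)$. Similarly,

$$ (\Delta p)^2 = \bigl\langle ( p - \langle p \rangle )^2 \bigr\rangle = \langle p^2 \rangle - \langle p \rangle^2 \tag{7.72} $$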


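is the squared variance of the momentum, where, by Plancherel's formula (7.64) and (7.43),

$$ \langle p^2 \rangle = \hbar^2 \int_{-\infty}^{\infty} k^2\, | \hat\varphi(k) |^2\, dk = \hbar^2 \int_{-\infty}^{\infty} | i k\, \hat\varphi(k) |^2\, dk = \hbar^2 \int_{-\infty}^{\infty} | \varphi'(x) |^2\, dx = \hbar^2\, \| \varphi'(x) \|^2. \tag{7.73} $$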


With this interpretation, the Uncertainty Principle for position and momentum measurements
can be stated.




Theorem 7.17. If $\varphi(x)$ is a wave function, so $\| \varphi \| = 1$, then the observed variances
in position and momentum satisfy the inequality

$$ \Delta x \, \Delta p \ge \tfrac12\, \hbar. \tag{7.74} $$


Now,  the  smaller  the  variance  of  a  quantity  such  as  position  or  momentum,  the  more 
accurate will be its measurement. Thus, the Heisenberg inequality (7.74) effectively quantifies
the statement that the more accurately we are able to measure the momentum $p$, the
less  accurate  will  be  any  measurement  of  its  position  x,  and  vice  versa.  For  more  details, 
along  with  physical  and  experimental  consequences,  you  should  consult  an  introductory 
text  on  mathematical  quantum  mechanics,  e.g.,  [66,  72 


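Proof: For any value of the real parameter $t$,

$$ 0 \le \| t\, x\, \varphi(x) + \varphi'(x) \|^2 = t^2\, \| x\, \varphi(x) \|^2 + t\, \bigl[ \langle \varphi'(x) , x\, \varphi(x) \rangle + \langle x\, \varphi(x) , \varphi'(x) \rangle \bigr] + \| \varphi'(x) \|^2. \tag{7.75} $$

The middle term in the final expression can be evaluated as follows:

$$ \begin{aligned}
\langle \varphi'(x) , x\, \varphi(x) \rangle + \langle x\, \varphi(x) , \varphi'(x) \rangle
&= \int_{-\infty}^{\infty} \bigl[ x\, \varphi'(x)\, \overline{\varphi(x)} + x\, \varphi(x)\, \overline{\varphi'(x)} \bigr]\, dx \\
&= \int_{-\infty}^{\infty} x\, \frac{d}{dx} | \varphi(x) |^2\, dx = - \int_{-\infty}^{\infty} | \varphi(x) |^2\, dx = -1,
\end{aligned} $$

via an integration by parts, noting that the boundary terms vanish, provided $\varphi(x)$ satisfies
the $L^2$ decay criterion (7.59). Thus, in view of (7.68) and (7.73), the inequality in (7.75)
reads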


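$$ \langle x^2 \rangle\, t^2 - t + \frac{\langle p^2 \rangle}{\hbar^2} \ge 0 \quad \text{for all} \quad t \in \mathbb{R}. $$

The minimum value of the left-hand side occurs at $t = 1/(2\, \langle x^2 \rangle)$, where its value is

$$ \frac{\langle p^2 \rangle}{\hbar^2} - \frac{1}{4\, \langle x^2 \rangle} \ge 0, \quad \text{which implies} \quad \langle x^2 \rangle\, \langle p^2 \rangle \ge \tfrac14\, \hbar^2. $$

To obtain the uncertainty relation (7.74), one performs the selfsame calculation, but with
$x - \langle x \rangle$ replacing $x$ and $p - \langle p \rangle$ replacing $p$. The result is

$$ \bigl\langle ( x - \langle x \rangle )^2 \bigr\rangle\, t^2 - t + \frac{\bigl\langle ( p - \langle p \rangle )^2 \bigr\rangle}{\hbar^2} = (\Delta x)^2\, t^2 - t + \frac{(\Delta p)^2}{\hbar^2} \ge 0. \tag{7.76} $$

Substituting $t = 1/(2\, (\Delta x)^2)$ produces the Heisenberg inequality (7.74). Q.E.D.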
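The quantities entering the Heisenberg inequality can be approximated directly from samples of a wave function, using the position-space formulas (7.66), (7.68), (7.71), and (7.73). The sketch below makes several assumptions that are not part of the text: units in which $\hbar = 1$, a uniform grid wide enough that the wave function is negligible at its ends, finite-difference derivatives, and two illustrative profiles, a Gaussian and the odd function $x\, e^{-x^2/2}$. The function names are hypothetical.

```python
import numpy as np

def trapezoid(y, h):
    """Composite trapezoid rule on a uniform grid with spacing h."""
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def uncertainty_product(phi, x, hbar=1.0):
    """Approximate (Delta x) * (Delta p) for samples phi on the uniform grid x."""
    h = x[1] - x[0]
    phi = phi / np.sqrt(trapezoid(np.abs(phi)**2, h))    # enforce ||phi|| = 1
    dphi = np.gradient(phi, h)                            # finite-difference phi'(x)
    density = np.abs(phi)**2
    x_mean = trapezoid(x * density, h)                    # formula (7.66)
    x2_mean = trapezoid(x**2 * density, h)                # formula (7.68)
    p_mean = (-1j * hbar * trapezoid(dphi * np.conj(phi), h)).real  # formula (7.71)
    p2_mean = hbar**2 * trapezoid(np.abs(dphi)**2, h)     # formula (7.73)
    return np.sqrt(x2_mean - x_mean**2) * np.sqrt(p2_mean - p_mean**2)

x = np.linspace(-12.0, 12.0, 24001)
gaussian = np.exp(-x**2 / 2.0)
odd_state = x * np.exp(-x**2 / 2.0)
print(uncertainty_product(gaussian, x))   # close to 0.5, the lower bound when hbar = 1
print(uncertainty_product(odd_state, x))  # close to 1.5, comfortably above the bound
```

The Gaussian attains the bound in (7.74), while the odd profile illustrates that a typical wave function lies strictly above it.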


Exercises 


7.4.1. (a) Write out the Plancherel formula for the square wave pulse
$f(x) = \begin{cases} 1, & |x| < 1, \\ 0, & |x| > 1. \end{cases}$
(b) What is $\displaystyle \int_0^\infty \frac{\sin^2 x}{x^2}\, dx$?

7.4.2. Apply the Plancherel formula to the even decaying pulse (7.20) to evaluate
$\displaystyle \int_{-\infty}^{\infty} \frac{dx}{(a^2 + x^2)^2}$. How would you compute this integral using elementary calculus?

T 7.4.3. (a) Find the Fourier transform of the function
$f_n(x) = \begin{cases} -\, n^2\, \operatorname{sign} x, & |x| < \frac{1}{n}, \\ 0, & \text{otherwise,} \end{cases}$
where $n$ is a positive integer. (b) Write out the Plancherel formula for $f_n(x)$. (c) Determine the
limit, as $n \to \infty$, of the Fourier transform of $f_n(x)$. (d) Explain why the limit should be
the Fourier transform of the derivative of the delta function $\delta'(x)$.

7.4.4.  Prove  that  Parseval’s  formula  is  a  consequence  of  Plancherel’s  formula.  Hint:  Use  the 
identity  in  Exercise  3.5.34(b). 

0  7.4.5.  Prove  that  the  Hilbert  space  L2(R)  is  a  complex  vector  space. 

0 7.4.6. We did not quite tell the truth when we said that $L^2$ functions must decay at large
distances: Prove that the following function is in $L^2$ but does not go to zero as $|x| \to \infty$:
$f(x) = \begin{cases} 1, & n - n^{-2} < x < n + n^{-2} \ \text{ for } \ n = \pm 1, \pm 2, \pm 3, \ldots, \\ 0, & \text{otherwise.} \end{cases}$

7.4.7. Modify the function in Exercise 7.4.6 to produce a function $f \in L^2$ that nevertheless
satisfies $\lim_{n \to \pm\infty} f(n) = \infty$ for $n \in \mathbb{Z}$.

0 7.4.8. Suppose $f \in L^2$ is continuously differentiable, $f \in C^1$, and has bounded derivative:
$| f'(x) | \le M$ for all $x \in \mathbb{R}$. Prove that $f(x) \to 0$ as $x \to \pm\infty$.


7.4.9. (a) Find the constant $a > 0$ such that $\varphi(x) = a\, e^{-|x|}$ is a wave function.
(b) Verify the Heisenberg inequality (7.74) for this particular wave function.


7.4.10. Answer Exercise 7.4.9 when (a) $\varphi(x) = a\, e^{-x^2}$, (b) $\varphi(x) = \dfrac{a}{1 + x^2}$.


^  7.4.11.  Write  out  a  detailed  derivation  of  the  final  inequality  (7.76). 


Chapter  8 

Linear  and  Nonlinear  Evolution  Equations 


The term evolution equation refers to a dynamical partial differential equation that involves
both time $t$ and space $x = (x_1, \ldots, x_n)$ as independent variables and takes the form

$$ \frac{\partial u}{\partial t} = K[u], \tag{8.1} $$

whose  left-hand  side  is  just  the  first-order  time  derivative  of  the  dependent  variable  u, 
while  the  right-hand  side,  which  can  be  linear  or  nonlinear,  involves  only  u  and  its  space 
derivatives  and,  possibly,  t  and  x.  Examples  already  encountered  include  the  linear  and 
nonlinear  transport  equations  in  Chapter  2  and  the  heat  equation.  (But  not  the  wave 
equation  or  Laplace  equation.)  In  this  chapter,  we  will  analyze  several  important  evolution 
equations,  both  linear  and  nonlinear,  involving  a  single  spatial  variable. 

Our  first  stop  is  to  revisit  the  heat  equation.  We  introduce  the  fundamental  solution, 
which,  for  dynamical  partial  differential  equations,  assumes  the  role  of  the  Green’s  function, 
in  that  its  initial  condition  is  a  concentrated  delta  impulse.  The  fundamental  solution  leads 
to  an  integral  superposition  formula  for  the  solutions  produced  by  more  general  initial 
conditions  or  by  external  forcing.  For  the  heat  equation  on  the  entire  real  line,  the  Fourier 
transform  enables  us  to  construct  an  explicit  formula  that  identifies  its  fundamental  solution 
as a Gaussian filter. We next present the Maximum Principle that rigorously justifies
the  entropic  decay  of  temperature  in  a  heated  body  and  underlies  much  of  the  advanced 
mathematical  analysis  of  parabolic  partial  differential  equations.  Finally,  we  discuss  the 
Black-Scholes  equation,  the  paradigmatic  model  for  investment  portfolios,  first  proposed 
in  the  early  1970s  and  now  lying  at  the  heart  of  the  modern  financial  industry.  We  will 
find  that  the  Black-Scholes  equation  can  be  transformed  into  the  linear  heat  equation, 
whose  fundamental  solution  is  applied  to  establish  the  celebrated  Black-Scholes  formula 
for  option  pricing. 

The following section provides a brief introduction to symmetry-based solution techniques
for linear and nonlinear partial differential equations. Knowing a symmetry of a
partial differential equation allows one to readily construct additional solutions from any
known solution. Solutions that remain invariant under a one-parameter family of symmetries
can be found by solving a reduced ordinary differential equation. The most important
are  the  traveling  wave  solutions,  which  are  invariant  under  translation  symmetries,  and 
similarity  solutions,  which  are  invariant  under  scaling  symmetries. 

The next evolution equation to appear is a paradigmatic model of nonlinear diffusion
known as Burgers' equation. It can be regarded as a very simplified model of fluid dynamics,
combining both nonlinear and viscous effects. We discover a remarkable nonlinear change
of variables that maps Burgers' equation to the linear heat equation, and thereby facilitates
its analysis, allowing us to construct explicit solutions, and investigate how they converge
to shock wave solutions of the nonlinear transport equation in the inviscid limit.

Next, we turn our attention to the simplest third-order linear evolution equation, which
arises  as  a  model  for  wave  mechanics.  Unlike  first-  and  second-order  wave  equations,  its 
solutions  are  not  simple  traveling  waves,  but  instead  exhibit  dispersion,  in  which  oscillatory 
waves  of  different  frequencies  move  at  different  speeds.  As  a  result,  initially  localized 
disturbances  will  spread  out  or  disperse,  even  while  they  conserve  the  underlying  energy. 
Dispersion  implies  that  the  individual  wave  velocities  differ  from  the  group  velocity,  which 
measures  the  speed  of  propagation  of  energy  in  the  system.  An  everyday  manifestation  of 
this  phenomenon  can  be  observed  in  the  ripples  caused  by  throwing  a  rock  into  a  pond: 
the  individual  waves  move  faster  than  the  overall  disturbance.  Finally,  we  present  the 
remarkable  Talbot  effect,  only  recently  discovered,  in  which  solutions  having  discontinuous 
initial  data  and  subject  to  periodic  boundary  conditions  exhibit  radically  different  profiles 
at  rational  and  irrational  times. 


Our final example is the celebrated Korteweg-de Vries equation, which originally arose
in the work of the nineteenth-century French applied mathematician Joseph Boussinesq as
a model for surface waves on shallow water. It combines the effects of linear dispersion and
nonlinear transport. Unlike the linearly dispersive model, the Korteweg-de Vries equation
admits explicit, localized traveling wave solutions, now known as "solitons". Remarkably,
despite the potentially complicated nonlinear nature of their interaction, two solitons
emerge from a collision with their individual profiles preserved, the only residual effect
being a relative phase shift. The Korteweg-de Vries equation is the prototype of a completely
integrable partial differential equation, whose many remarkable properties were
first discovered in the mid 1960s. A surprising number of such completely integrable nonlinear
systems appear in a variety of applications, including dynamical models in fluids,
plasmas, optics, and solid mechanics. Their analysis remains an extremely active area of
contemporary research, [2,36].


8.1  The  Fundamental  Solution  to  the  Heat  Equation 

One  disadvantage  of  the  Fourier  series  solution  to  the  heat  equation  is  that  it  is  not  nearly 
as  explicit  as  one  might  desire  for  practical  applications,  numerical  computations,  or  even 
further  theoretical  investigations  and  developments.  An  alternative  approach  is  based  on 
the  idea  of  the  fundamental  solution ,  which  plays  the  role  of  the  Green’s  function  in  solving 
initial  value  problems.  The  fundamental  solution  measures  the  effect  of  a  concentrated, 
instantaneous  impulse,  either  in  the  initial  conditions  or  as  an  external  force  on  the  system. 

We restrict our attention to homogeneous boundary conditions, keeping in mind
that inhomogeneous boundary conditions can always be handled by use of linear superposition.
The basic idea is to analyze the case in which the initial data
$u(0,x) = \delta_\xi(x) = \delta(x - \xi)$ is a delta function,
which we can interpret as a highly concentrated unit heat source, e.g., a soldering iron or
laser beam, that is instantaneously applied at a position $\xi$ along a metal bar. The heat
will diffuse away from its initial concentration, and the resulting fundamental solution is
denoted by

$$ u(t,x) = F(t,x;\xi), \quad \text{with} \quad F(0,x;\xi) = \delta(x - \xi). \tag{8.2} $$

For each fixed $\xi$, the fundamental solution, considered as a function of $t > 0$ and $x$, must
satisfy the underlying partial differential equation, and so, for the heat equation,

$$ \frac{\partial F}{\partial t} = \gamma\, \frac{\partial^2 F}{\partial x^2}, \tag{8.3} $$

along with the specified homogeneous boundary conditions.

As  with  the  Green’s  function,  once  we  have  determined  the  fundamental  solution,  we 
can  then  use  linear  superposition  to  reconstruct  the  general  solution  to  the  initial-boundary 
value  problem.  Namely,  we  first  write  the  initial  data 


$$ u(0,x) = f(x) = \int_a^b \delta(x - \xi)\, f(\xi)\, d\xi \tag{8.4} $$

as a superposition of delta functions, as in (6.16). Linearity implies that the solution
can be expressed as the corresponding superposition of the responses to those individual
concentrated delta profiles:

$$ u(t,x) = \int_a^b F(t,x;\xi)\, f(\xi)\, d\xi. \tag{8.5} $$

Assuming that we can differentiate under the integral sign, the fact that $F(t,x;\xi)$ satisfies
the differential equation and the homogeneous boundary conditions for each fixed $\xi$
immediately implies that the integral (8.5) is also a solution with the correct initial and
(homogeneous) boundary conditions.

Unfortunately, most boundary value problems do not have fundamental solutions that
can be written down in closed form. An important exception is the case of an infinitely
long homogeneous bar, which requires solving the heat equation on the entire real line:

$$ \frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \quad \text{for} \quad -\infty < x < \infty, \quad t > 0. \tag{8.6} $$

For simplicity, we have chosen units in which the thermal diffusivity is $\gamma = 1$. The solution
$u(t,x)$ is defined for all $x \in \mathbb{R}$, and has initial conditions

$$ u(0,x) = f(x) \quad \text{for} \quad -\infty < x < \infty. \tag{8.7} $$

In order to specify the solution uniquely, we shall require that the temperature be square-integrable,
i.e., in $L^2$, at all times, so that

$$ \int_{-\infty}^{\infty} | u(t,x) |^2\, dx < \infty \quad \text{for all} \quad t \ge 0. \tag{8.8} $$

Roughly speaking, square-integrability requires that the temperature be vanishingly small
at large distances, and hence plays the role of boundary conditions in this context.

To solve the initial value problem (8.6-7), we apply the Fourier transform, in the $x$
variable, to both sides of the differential equation. In view of the effect of the Fourier
transform on derivatives, cf. (7.43), the result is

$$ \frac{\partial \hat u}{\partial t} = -\, k^2\, \hat u, \tag{8.9} $$

where

$$ \hat u(t,k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} u(t,x)\, e^{-i k x}\, dx \tag{8.10} $$

is the Fourier transformed solution. For each fixed $k$, (8.9) can be viewed as a first-order
linear ordinary differential equation for $\hat u(t,k)$, with initial conditions

$$ \hat u(0,k) = \hat f(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-i k x}\, dx \tag{8.11} $$

given by Fourier transforming the initial data (8.7). The solution to the initial value
problem (8.9, 11) is immediate:

$$ \hat u(t,k) = e^{-k^2 t}\, \hat f(k). \tag{8.12} $$

We can thus recover the solution to the initial value problem (8.6-7) by applying the inverse
Fourier transform to (8.12), leading to the explicit integral formula

$$ u(t,x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{i k x}\, \hat u(t,k)\, dk = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{i k x - k^2 t}\, \hat f(k)\, dk. \tag{8.13} $$

Figure 8.1. The fundamental solution to the one-dimensional heat equation.

In particular, to construct the fundamental solution, we take the initial temperature
profile to be a delta function $\delta_\xi(x) = \delta(x - \xi)$ concentrated at $x = \xi$. According to (7.37),
its Fourier transform is

$$ \hat\delta_\xi(k) = \frac{e^{-i k \xi}}{\sqrt{2\pi}}. $$

Plugging this into (8.13), and then referring to our table of Fourier transforms, we are led
to the following explicit formula for the fundamental solution:

$$ F(t,x;\xi) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i k (x - \xi) - k^2 t}\, dk = \frac{1}{2 \sqrt{\pi t}}\, e^{-(x - \xi)^2/(4t)} \quad \text{for} \quad t > 0. \tag{8.14} $$

As you can verify, for each fixed $\xi$, the function $F(t,x;\xi)$ is indeed a solution to the heat
equation for all $t > 0$. In addition,

$$ \lim_{t \to 0^+} F(t,x;\xi) = \begin{cases} 0, & x \ne \xi, \\ \infty, & x = \xi. \end{cases} $$

Furthermore, its integral

$$ \int_{-\infty}^{\infty} F(t,x;\xi)\, dx = 1 \tag{8.15} $$

is constant, in accordance with the law of conservation of thermal energy; see Exercise
8.1.20. Therefore, as $t \to 0^+$, the fundamental solution satisfies the original limiting
definition (6.8-9) of the delta function, and so $F(0,x;\xi) = \delta_\xi(x)$ has the desired initial
temperature profile.
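Both claims, that $F$ solves the heat equation and that its integral (8.15) equals $1$ at every positive time, can be spot-checked numerically. The sketch below is only a sanity check under stated assumptions: the function names, the finite-difference step sizes, and the truncated integration window are illustrative choices rather than part of the text.

```python
import numpy as np

def fundamental_solution(t, x, xi=0.0):
    """Heat kernel (8.14) with unit diffusivity, valid for t > 0."""
    return np.exp(-(x - xi)**2 / (4.0 * t)) / (2.0 * np.sqrt(np.pi * t))

def total_heat(t, xi=0.0, half_width=60.0, n=120001):
    """Trapezoid approximation of the integral (8.15) over a finite window."""
    x = np.linspace(xi - half_width, xi + half_width, n)
    h = x[1] - x[0]
    y = fundamental_solution(t, x, xi)
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def pde_residual(t, x, xi=0.0, dt=1e-5, dx=1e-3):
    """Centered-difference estimate of F_t - F_xx, which should be near zero."""
    F = fundamental_solution
    F_t = (F(t + dt, x, xi) - F(t - dt, x, xi)) / (2.0 * dt)
    F_xx = (F(t, x + dx, xi) - 2.0 * F(t, x, xi) + F(t, x - dx, xi)) / dx**2
    return F_t - F_xx

for t in (0.01, 1.0, 10.0):
    print(t, total_heat(t))       # each value should be close to 1
print(pde_residual(1.0, 0.7))     # small, limited by finite-difference error
```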

In  Figure  8.1,  we  graph  F(t,x;  0)  at  the  indicated  times.  It  starts  life  as  a  delta 
spike  concentrated  at  the  origin,  and  then  immediately  smooths  out  into  a  tall  and  narrow 
bell-shaped  curve,  centered  at  x  =  0.  As  time  increases,  the  solution  shrinks  and  widens, 
eventually decaying everywhere to zero. Its amplitude is proportional to $t^{-1/2}$, while its
overall width is proportional to $t^{1/2}$. The thermal energy (8.15), which is the area under
the  graph,  remains  fixed  while  gradually  spreading  out  over  the  entire  real  line. 


Remark :  In  probability,  these  exponentially  bell-shaped  curves  are  known  as  normal  or 
Gaussian  distributions ,  [39].  The  width  of  the  bell  curve  measures  its  standard  deviation. 
For  this  reason,  the  fundamental  solution  to  the  heat  equation  is  sometimes  referred  to  as 
a  Gaussian  filter. 

Remark: The fact that the fundamental solution depends only on the difference $x - \xi$,
and hence has the same profile at all $\xi \in \mathbb{R}$, is a consequence of the translation invariance
of  the  heat  equation,  reflecting  the  fact  that  it  models  the  thermodynamics  of  a  uniform 
medium.  See  Section  8.2  for  additional  symmetry  properties  of  the  heat  equation  and  its 
solutions. 


Remark:  One  of  the  striking  properties  of  the  heat  equation  is  that  thermal  energy 
propagates  with  infinite  speed.  Indeed,  because,  at  any  t  >  0,  the  fundamental  solution 
is  nonzero  for  all  x,  the  effect  of  an  initial  concentration  of  heat  will  immediately  be  felt 
along  the  entire  length  of  an  infinite  bar.  (The  graphs  in  Figure  8.1  are  a  little  misleading 
because  they  fail  to  show  the  extremely  small,  but  still  positive,  exponentially  decreasing 
tails.)  This  effect,  while  more  or  less  negligible  at  large  distances,  is  nevertheless  in  clear 
violation  of  physical  intuition  —  not  to  mention  relativity,  which  postulates  that  signals 
cannot  propagate  faster  than  the  speed  of  light.  Despite  this  non-physical  artifact,  the  heat 
equation  remains  an  accurate  model  for  heat  propagation  and  similar  diffusive  phenomena, 
and  so  continues  to  be  successfully  used  in  applications. 


With the fundamental solution in hand, we can adapt the linear superposition formula (8.5)
to reconstruct the general solution

$$ u(t,x) = \frac{1}{2 \sqrt{\pi t}} \int_{-\infty}^{\infty} e^{-(x - \xi)^2/(4t)}\, f(\xi)\, d\xi \tag{8.16} $$

to our initial value problem (8.6). This solution formula is merely a restatement of (8.13)
combined with the Fourier transform formula (8.11). Comparing with (7.54), we see that
the solutions are obtained by convolution of the initial data with a one-parameter family
of progressively wider and shorter Gaussian filters:

$$ u(t,x) = F_0(t,x) * f(x), \quad \text{where} \quad F_0(t,x) = F(t,x;0) = \frac{e^{-x^2/(4t)}}{2 \sqrt{\pi t}}. $$

Since $u(t,x)$ solves the heat equation, we conclude that Gaussian filter convolution has the
same smoothing effect on the initial signal $f(x)$. Indeed, the convolution integral (8.16)
serves to replace each initial value $f(x)$ by a weighted average of nearby values, the weight
being determined by the Gaussian distribution. This has the effect of smoothing out high-frequency
variations in the signal, and, consequently, the Gaussian convolution formula
(8.16) provides an effective method for denoising rough signals and data.

Figure 8.2. Error function solution to the heat equation.
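To illustrate the denoising interpretation, the convolution (8.16) can be approximated on a grid by a Riemann sum. In the sketch below, the clean signal $\sin x$, the noise level, the smoothing time $t = 0.05$, and the function name `heat_smooth` are all illustrative assumptions; moreover, the signal is implicitly treated as zero outside the grid, so the comparison is restricted to interior points.

```python
import numpy as np

def heat_smooth(f_vals, x, t):
    """Approximate (8.16) on a uniform grid x by a Riemann sum over xi.

    Values of the signal outside the grid are implicitly taken to be zero.
    """
    h = x[1] - x[0]
    diff = x[:, None] - x[None, :]                       # all differences x_i - xi_j
    kernel = np.exp(-diff**2 / (4.0 * t)) / (2.0 * np.sqrt(np.pi * t))
    return h * (kernel @ f_vals)

rng = np.random.default_rng(0)
x = np.linspace(-10.0, 10.0, 1001)
clean = np.sin(x)
noisy = clean + 0.3 * rng.standard_normal(x.size)
smooth = heat_smooth(noisy, x, t=0.05)

interior = np.abs(x) < 8.0                               # stay away from the grid ends
err_noisy = np.sqrt(np.mean((noisy - clean)[interior]**2))
err_smooth = np.sqrt(np.mean((smooth - clean)[interior]**2))
print(err_noisy, err_smooth)   # root-mean-square errors before and after smoothing
```

Larger values of $t$ remove more noise but also flatten the underlying signal, so in practice the smoothing time acts as a tuning parameter.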

Example 8.1. An infinite bar is initially heated to unit temperature along a finite
interval. The initial temperature profile is thus a box function

$$ u(0,x) = f(x) = \sigma(x - a) - \sigma(x - b) = \begin{cases} 1, & a < x < b, \\ 0, & \text{otherwise.} \end{cases} $$

The ensuing temperature is provided by the solution to the heat equation obtained by the
integral formula (8.16):

$$ u(t,x) = \frac{1}{2 \sqrt{\pi t}} \int_a^b e^{-(x - \xi)^2/(4t)}\, d\xi = \frac12 \left[ \operatorname{erf}\left( \frac{x - a}{2 \sqrt{t}} \right) - \operatorname{erf}\left( \frac{x - b}{2 \sqrt{t}} \right) \right], \tag{8.17} $$

where erf denotes the error function, as defined in (2.87). Graphs of the solution (8.17) for
$a = -5$, $b = 5$, at the indicated times, are displayed in Figure 8.2. Observe the instantaneous
smoothing of the sharp interface and instantaneous propagation of the disturbance,
followed by a gradual decay to thermal equilibrium, with $u(t,x) \to 0$ as $t \to \infty$.
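The closed form (8.17) is convenient for computation, and it can be compared against a direct quadrature of the integral it came from. The following is a small sketch; the function names and the midpoint rule with a fixed number of subintervals are illustrative choices, not part of the text.

```python
import math

def box_solution(t, x, a, b):
    """Closed form (8.17) for the box initial data, valid for t > 0."""
    s = 2.0 * math.sqrt(t)
    return 0.5 * (math.erf((x - a) / s) - math.erf((x - b) / s))

def box_solution_quadrature(t, x, a, b, n=20000):
    """Midpoint-rule approximation of the integral in (8.17)."""
    h = (b - a) / n
    total = 0.0
    for j in range(n):
        xi = a + (j + 0.5) * h                 # midpoint of the j-th subinterval
        total += math.exp(-(x - xi)**2 / (4.0 * t))
    return h * total / (2.0 * math.sqrt(math.pi * t))

for x in (-6.0, 0.0, 4.9, 7.0):
    print(x, box_solution(0.5, x, -5.0, 5.0), box_solution_quadrature(0.5, x, -5.0, 5.0))
```

At very small times the formula also shows the behavior at the interface: at $x = a$ the temperature is close to $\frac12$, the average of the values on either side of the jump.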


The Forced Heat Equation and Duhamel's Principle

The fundamental solution approach can also be applied to solve the inhomogeneous heat
equation

$$ u_t = u_{xx} + h(t,x), \tag{8.18} $$

modeling a bar subject to an external heat source $h(t,x)$, which might depend on both
position and time. We begin by solving the particular case

$$ u_t = u_{xx} + \delta(t - \tau)\, \delta(x - \xi), \tag{8.19} $$

whose inhomogeneity represents a heat source of unit magnitude that is concentrated at a
position $x = \xi$ and applied at a single time $t = \tau > 0$. Physically, this models the effect of
instantaneously applying a soldering iron to a single spot on the bar. Let us also impose
homogeneous initial conditions

$$ u(0,x) = 0 \tag{8.20} $$

as well as homogeneous boundary conditions of one of our standard types. The resulting
solution

$$ u(t,x) = G(t,x;\tau,\xi) \tag{8.21} $$

will be referred to as the general fundamental solution to the heat equation. Since a heat
source that is applied at time $\tau$ will affect the solution only at later times $t \ge \tau$, we expect
that

$$ G(t,x;\tau,\xi) = 0 \quad \text{for all} \quad t < \tau. \tag{8.22} $$

Indeed, since $u(t,x)$ solves the unforced heat equation at all times $t < \tau$ subject to homogeneous
boundary conditions and has zero initial temperature, this follows immediately
from the uniqueness of the solution to the initial-boundary value problem.

Once we know the general fundamental solution (8.21), we are able to solve the problem
for a general external heat source (8.18). We first write the forcing as a superposition

$$ h(t,x) = \int_0^\infty \int_a^b \delta(t - \tau)\, \delta(x - \xi)\, h(\tau,\xi)\, d\xi\, d\tau \tag{8.23} $$

of concentrated instantaneous heat sources. Linearity allows us to conclude that the solution
is given by the self-same superposition formula

$$ u(t,x) = \int_0^t \int_a^b G(t,x;\tau,\xi)\, h(\tau,\xi)\, d\xi\, d\tau. \tag{8.24} $$

The fact that we only need to integrate over times $0 \le \tau \le t$ is a consequence of (8.22).

Remark: If we have a nonzero initial condition, $u(0,x) = f(x)$, then, by linear superposition,
the solution

$$ u(t,x) = \int_a^b F(t,x;\xi)\, f(\xi)\, d\xi + \int_0^t \int_a^b G(t,x;\tau,\xi)\, h(\tau,\xi)\, d\xi\, d\tau \tag{8.25} $$

is a combination of (a) the solution with no external heat source, but nonzero initial
conditions, plus (b) the solution with homogeneous initial conditions but nonzero heat
source.


Let us explicitly solve the forced heat equation on an infinite interval $-\infty < x < \infty$.
We begin by computing the general fundamental solution. As before, we take the Fourier
transform of both sides of the partial differential equation (8.19) with respect to $x$. In view
of (7.37, 43), we find

$$ \frac{\partial \hat u}{\partial t} + k^2\, \hat u = \frac{1}{\sqrt{2\pi}}\, e^{-i k \xi}\, \delta(t - \tau), \tag{8.26} $$

which is an inhomogeneous first-order ordinary differential equation for the Fourier transform
$\hat u(t,k)$ of $u(t,x)$, while (8.20) implies the initial condition

$$ \hat u(0,k) = 0. \tag{8.27} $$




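We solve the initial value problem (8.26-27) by the usual method, [18,23]. Multiplying
the differential equation by the integrating factor $e^{k^2 t}$ yields

$$ \frac{\partial}{\partial t} \bigl( e^{k^2 t}\, \hat u \bigr) = \frac{1}{\sqrt{2\pi}}\, e^{k^2 t - i k \xi}\, \delta(t - \tau). $$

Integrating both sides from $0$ to $t$ and using the initial condition, we obtain

$$ \hat u(t,k) = \frac{1}{\sqrt{2\pi}}\, e^{-k^2 (t - \tau) - i k \xi}\, \sigma(t - \tau), $$

where $\sigma(s)$ is the usual step function (6.23). Finally, we apply the inverse Fourier transform
formula (7.9), and then (8.14), to deduce that

$$ \begin{aligned}
u(t,x) = G(t,x;\tau,\xi) &= \frac{\sigma(t - \tau)}{2\pi} \int_{-\infty}^{\infty} e^{-k^2 (t - \tau) + i k (x - \xi)}\, dk \\
&= \frac{\sigma(t - \tau)}{2 \sqrt{\pi (t - \tau)}}\, \exp\left[ -\, \frac{(x - \xi)^2}{4\, (t - \tau)} \right] \\
&= \sigma(t - \tau)\, F(t - \tau, x; \xi).
\end{aligned} \tag{8.28} $$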


Thus, the general fundamental solution is obtained by translating the fundamental solution
$F(t,x;\xi)$ for the initial value problem to a starting time of $t = \tau$ instead of $t = 0$. Finally,
the superposition principle (8.24) produces the solution,

$$u(t,x) = \int_0^t \int_{-\infty}^{\infty} \frac{h(\tau,\xi)}{2\sqrt{\pi(t-\tau)}}\, \exp\left[ -\,\frac{(x-\xi)^2}{4(t-\tau)} \right] d\xi\, d\tau, \tag{8.29}$$

to the heat equation with source term and zero initial condition on an infinite bar. A
nonzero initial condition $u(0,x) = f(x)$ leads, as in the superposition formula (8.25), to an
additional term of the form (8.16) in the solution formula.
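The general fundamental solution (8.28) can be spot-checked numerically. The following is a minimal sketch using only the Python standard library; the function names `F`, `G`, `total_heat`, and `heat_residual` and all parameter values are illustrative choices, not notation from the text. It confirms that $G$ vanishes before the impulse time, carries unit total heat afterwards, and satisfies $u_t = u_{xx}$ for $t > \tau$ up to finite-difference error.

```python
import math

def F(t, x, xi=0.0):
    """Fundamental solution of u_t = u_xx for a unit impulse at position xi (t > 0)."""
    return math.exp(-(x - xi) ** 2 / (4.0 * t)) / (2.0 * math.sqrt(math.pi * t))

def G(t, x, tau, xi):
    """General fundamental solution (8.28): identically zero up to the impulse time tau."""
    return F(t - tau, x, xi) if t > tau else 0.0

def total_heat(t, tau, xi, half_width=40.0, n=20000):
    """Midpoint-rule approximation of the integral of G over xi - half_width < x < xi + half_width."""
    dx = 2.0 * half_width / n
    return sum(G(t, xi - half_width + (i + 0.5) * dx, tau, xi) for i in range(n)) * dx

def heat_residual(t, x, tau, xi, h=1e-3):
    """Centered-difference estimate of G_t - G_xx, expected to be near zero for t > tau."""
    g_t = (G(t + h, x, tau, xi) - G(t - h, x, tau, xi)) / (2.0 * h)
    g_xx = (G(t, x + h, tau, xi) - 2.0 * G(t, x, tau, xi) + G(t, x - h, tau, xi)) / h ** 2
    return g_t - g_xx

print(G(0.2, 0.7, tau=0.5, xi=0.3))              # before the impulse: no heat yet
print(total_heat(t=1.5, tau=0.5, xi=0.3))        # total heat after the impulse
print(heat_residual(t=1.5, x=0.7, tau=0.5, xi=0.3))
```

The midpoint rule is used only because it is short to write; any quadrature rule would serve for this smooth, rapidly decaying integrand.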


Remark: The fact that an initial condition has the same aftereffect on the temperature
as an instantaneous applied heat source of the same magnitude, thus implying the
identification (8.28) of the two types of fundamental solution, is known as Duhamel’s Principle,
named after the nineteenth-century French mathematician Jean-Marie Duhamel.
Duhamel’s Principle remains valid over a broad range of linear evolution equations.


Example 8.2. An infinitely long bar with unit thermal diffusivity starts out uniformly
at zero degrees. Beginning at time $t = 0$, a concentrated heat source of unit
magnitude is continually applied at the origin. The resulting temperature is the solution
$u(t,x)$ to the initial value problem

$$u_t = u_{xx} + \delta(x), \qquad u(0,x) = 0, \qquad t > 0, \quad -\infty < x < \infty.$$

According to (8.29), the solution is given by

$$u(t,x) = \int_0^t \int_{-\infty}^{\infty} \frac{\delta(\xi)}{2\sqrt{\pi(t-\tau)}}\, \exp\left[ -\,\frac{(x-\xi)^2}{4(t-\tau)} \right] d\xi\, d\tau = \int_0^t \frac{1}{2\sqrt{\pi(t-\tau)}}\, \exp\left[ -\,\frac{x^2}{4(t-\tau)} \right] d\tau.$$


Three snapshots can be seen in Figure 8.3. Observe that the solution is even in $x$ and
monotonically decreasing as $|x| \to \infty$. Moreover, it has a corner at the origin with limiting
tangent lines of slopes $\pm\frac12$, which implies that its second $x$ derivative produces the
delta-function forcing term. At each time $t$, the solution can be viewed as the linear superposition
of a continuous family of fundamental solutions, corresponding to the cumulative effect of
individual heat sources applied at each previous time $0 \le \tau \le t$. Moreover, it is not
difficult to see that, at each fixed $x$, the temperature is monotonically increasing in $t$, with
$u(t,x) \to \infty$ as $t \to \infty$, and hence the continuous heat source eventually produces an
unbounded temperature in the entire infinite bar.
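The qualitative claims in Example 8.2 can be checked numerically. The sketch below evaluates the solution integral by a midpoint rule after the substitution $t - \tau = w^2$, which removes the integrable singularity at $\tau = t$. It compares the result with the closed form $\sqrt{t/\pi}\, e^{-x^2/(4t)} - \frac12 |x| \operatorname{erfc}\bigl(|x|/(2\sqrt t)\bigr)$. That closed form is not quoted from the text: it is an assumption of this sketch, obtained by checking that its $t$ derivative equals the integrand and that it vanishes as $t \to 0^+$. The function names are illustrative.

```python
import math

def u_quadrature(t, x, n=20000):
    """Midpoint-rule value of u(t,x) = int_0^t exp(-x^2/(4(t-tau))) / (2 sqrt(pi (t-tau))) dtau.

    With t - tau = w^2 the integral becomes (1/sqrt(pi)) * int_0^sqrt(t) exp(-x^2/(4 w^2)) dw,
    whose integrand is bounded on the whole interval.
    """
    dw = math.sqrt(t) / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * dw
        total += math.exp(-x * x / (4.0 * w * w))
    return total * dw / math.sqrt(math.pi)

def u_closed(t, x):
    """Candidate closed form (an assumption of this sketch, verified against the quadrature)."""
    return (math.sqrt(t / math.pi) * math.exp(-x * x / (4.0 * t))
            - 0.5 * abs(x) * math.erfc(abs(x) / (2.0 * math.sqrt(t))))

for t in (0.5, 1.0, 4.0):
    for x in (0.0, 0.5, 2.0):
        print(t, x, u_quadrature(t, x), u_closed(t, x))

# One-sided slope at the origin, to compare with the corner slopes of -1/2 and +1/2.
h = 1e-6
print((u_closed(1.0, h) - u_closed(1.0, 0.0)) / h)
```

At $x = 0$ the formula reduces to $u(t,0) = \sqrt{t/\pi}$, which makes the unbounded growth of the temperature at the source explicit.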


The  Black-Scholes  Equation  and  Mathematical  Finance 


The most important and influential partial differential equation in financial modeling and
investment is the celebrated Black-Scholes equation

$$\frac{\partial u}{\partial t} + \frac{\sigma^2}{2}\, x^2\, \frac{\partial^2 u}{\partial x^2} + r\, x\, \frac{\partial u}{\partial x} - r\, u = 0, \tag{8.30}$$

first proposed in 1973 by the American economists Fischer Black and Myron Scholes, [19],
and Robert Merton, [71]. The dependent variable $u(t,x)$ represents the monetary value
of a single financial option, meaning a contract to either buy or sell an asset at a specified
exercise price $p$ at a certain future time $t^*$. The value $u(t,x)$ of the option will depend
on the current time $t < t^*$ and the current price $x > 0$ of the underlying asset. As with
many financial models, one assumes the absence of arbitrage, meaning that there is no
way to make a riskless profit. The constant $\sigma > 0$ represents the asset’s volatility, while
$r$ denotes the (assumed fixed) interest rate for bank deposits, where investors could place
their money with a guaranteed rate of return instead of buying the option. (Investors
borrowing money to buy the asset would use a negative value of $r$.) The derivation of
the Black-Scholes equation from basic financial modeling relies on the theory of stochastic
differential equations, [83], which would take us too far afield to explain here; instead, we
refer the interested reader to [123]. The Black-Scholes equation and its generalizations
form the basis of much of the modern financial world, and, increasingly, the insurance
industry.

Observe first that the Black-Scholes equation is a backwards diffusion process, since,
upon solving for

$$\frac{\partial u}{\partial t} = -\,\frac{\sigma^2}{2}\, x^2\, \frac{\partial^2 u}{\partial x^2} - r\, x\, \frac{\partial u}{\partial x} + r\, u, \tag{8.31}$$

the coefficient of the diffusion term $u_{xx}$ is negative. This implies that the initial value
problem is well-posed only when time runs backwards. In other words, given a prescribed
value of the option at some specified time in the future, we can use the Black-Scholes
equation to determine its current value. However, ill-posedness implies that we cannot
predict future values from the current worth of the portfolio.

The “final value problem” for the Black-Scholes equation is to determine the option’s
value $u(t,x)$ at the current time $t$ and asset value $x > 0$, given the final condition

$$u(t^*, x) = f(x) \tag{8.32}$$

at the exercise time $t^* > t$. For a so-called European call option, whereby the asset is to
be bought at the exercise price $p > 0$ at the specified time, the final condition is

$$u(t^*, x) = \max\{\, x - p,\ 0 \,\}, \tag{8.33}$$

representing the investor’s profit when $x > p$, or, when $x < p$, the option not being exercised
so as to avoid a loss. Analogously, for a put option, where the asset is to be sold, the final
condition is

$$u(t^*, x) = \max\{\, p - x,\ 0 \,\}. \tag{8.34}$$

The solution $u(t,x)$ will be defined for all $t < t^*$ and all $x > 0$, subject to the boundary
conditions

$$u(t,0) = 0, \qquad u(t,x) \sim x \quad \text{as} \quad x \to \infty,$$

where the asymptotic boundary condition means that the ratio $u(t,x)/x$ tends to a constant
as $x \to \infty$.

Fortunately, the Black-Scholes equation can be solved explicitly by transforming it
into the heat equation. The first step is to convert it to a forward diffusion process, by
setting

$$\tau = \tfrac12\, \sigma^2 (t^* - t), \qquad v(\tau, x) = u(t^* - 2\tau/\sigma^2,\, x),$$

so that $\tau$ effectively runs forward from 0 as the actual time $t$ runs backwards from $t^*$. This
substitution has the effect of converting the final condition (8.32) into an initial condition
$v(0,x) = f(x)$. Moreover, a straightforward chain rule computation shows that $v$ satisfies

$$\frac{\partial v}{\partial \tau} = x^2\, \frac{\partial^2 v}{\partial x^2} + \kappa\, x\, \frac{\partial v}{\partial x} - \kappa\, v, \qquad \text{where} \qquad \kappa = \frac{2r}{\sigma^2}.$$
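The next step is to remove the explicit dependence on the independent variable $x$. The
hint is that the right-hand side has the form of an Euler ordinary differential equation,
[23, 89]. According to Exercise 4.3.23, these terms can be placed into constant-coefficient
form by the change of independent variables $x = e^y$. Indeed, writing

$$w(\tau, y) = v(\tau, e^y) = v(\tau, x) \qquad \text{when} \qquad x = e^y,$$

we apply the chain rule to compute the derivatives

$$\frac{\partial w}{\partial \tau} = \frac{\partial v}{\partial \tau}, \qquad \frac{\partial w}{\partial y} = e^y\, \frac{\partial v}{\partial x} = x\, \frac{\partial v}{\partial x}, \qquad \frac{\partial^2 w}{\partial y^2} = e^{2y}\, \frac{\partial^2 v}{\partial x^2} + e^y\, \frac{\partial v}{\partial x} = x^2\, \frac{\partial^2 v}{\partial x^2} + x\, \frac{\partial v}{\partial x}.$$

As a result, we find that $w$ solves the partial differential equation

$$\frac{\partial w}{\partial \tau} = \frac{\partial^2 w}{\partial y^2} + (\kappa - 1)\, \frac{\partial w}{\partial y} - \kappa\, w. \tag{8.35}$$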


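This is getting closer to the heat equation, and, in fact, can be changed into it by setting

$$w(\tau, y) = e^{\alpha \tau + \beta y}\, z(\tau, y)$$

for suitable constants $\alpha, \beta$. Indeed, differentiating and substituting into (8.35) yields

$$\frac{\partial z}{\partial \tau} + \alpha\, z = \frac{\partial^2 z}{\partial y^2} + 2\beta\, \frac{\partial z}{\partial y} + \beta^2 z + (\kappa - 1) \left( \frac{\partial z}{\partial y} + \beta\, z \right) - \kappa\, z.$$

The terms involving $\partial z/\partial y$ and $z$ are eliminated by setting

$$\alpha = -\tfrac14 (\kappa + 1)^2, \qquad \beta = -\tfrac12 (\kappa - 1). \tag{8.36}$$

We conclude that the function

$$z(\tau, y) = e^{(\kappa+1)^2 \tau/4 + (\kappa-1) y/2}\, w(\tau, y) \tag{8.37}$$

satisfies the heat equation

$$\frac{\partial z}{\partial \tau} = \frac{\partial^2 z}{\partial y^2}. \tag{8.38}$$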

Unwinding the preceding argument, we have managed to prove the following:

Proposition 8.3. If $z(\tau, y)$ is the solution to the initial value problem

$$\frac{\partial z}{\partial \tau} = \frac{\partial^2 z}{\partial y^2}, \qquad z(0, y) = h(y) = e^{(\kappa-1) y/2} f(e^y), \tag{8.39}$$

for $\tau > 0$, $-\infty < y < \infty$, then

$$u(t,x) = x^{-(\kappa-1)/2}\, e^{-(\kappa+1)^2 \sigma^2 (t^*-t)/8}\, z\bigl( \tfrac12 \sigma^2 (t^* - t),\ \log x \bigr) \tag{8.40}$$

solves the final value problem (8.30, 32) for the Black-Scholes equation for $t < t^*$ and
$0 < x < \infty$.
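Proposition 8.3 can be spot-checked without any symbolic algebra. The sketch below takes one explicit solution of the heat equation, $z(\tau,y) = e^{-\omega^2 \tau} \cos \omega y$, pushes it through formula (8.40), and estimates the left-hand side of the Black-Scholes equation (8.30) by centered finite differences. The parameter values, the choice of $z$, and all function names are assumptions made for illustration; a residual near zero supports the change of variables but does not replace the proof.

```python
import math

SIGMA, R, T_STAR = 0.2, 0.1, 10.0     # illustrative volatility, interest rate, exercise time
KAPPA = 2.0 * R / SIGMA ** 2          # kappa = 2 r / sigma^2
OMEGA = 0.7                           # arbitrary wave number for the test solution

def z(tau, y):
    """An explicit solution of the heat equation z_tau = z_yy."""
    return math.exp(-OMEGA ** 2 * tau) * math.cos(OMEGA * y)

def u(t, x):
    """Formula (8.40) applied to z; requires x > 0."""
    tau = 0.5 * SIGMA ** 2 * (T_STAR - t)
    return (x ** (-(KAPPA - 1.0) / 2.0)
            * math.exp(-(KAPPA + 1.0) ** 2 * tau / 4.0)
            * z(tau, math.log(x)))

def black_scholes_residual(t, x, h=1e-4):
    """Centered-difference estimate of u_t + (sigma^2/2) x^2 u_xx + r x u_x - r u."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2.0 * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / h ** 2
    return u_t + 0.5 * SIGMA ** 2 * x ** 2 * u_xx + R * x * u_x - R * u(t, x)

for t, x in [(9.0, 2.0), (5.0, 0.5), (0.0, 10.0)]:
    print(t, x, black_scholes_residual(t, x))   # each value should be near zero
```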

Now, according to (8.16), the solution to the initial value problem (8.39) can be written
as a convolution integral of the initial data with the heat equation’s fundamental solution:

$$z(\tau, y) = \frac{1}{2\sqrt{\pi \tau}} \int_{-\infty}^{\infty} e^{-(y-\eta)^2/(4\tau)}\, h(\eta)\, d\eta = \frac{1}{2\sqrt{\pi \tau}} \int_{-\infty}^{\infty} e^{-(y-\eta)^2/(4\tau) + (\kappa-1)\eta/2}\, f(e^\eta)\, d\eta. \tag{8.41}$$

Combining this formula with (8.40) produces an explicit solution formula for the general
final value problem for the Black-Scholes equation. In particular, for the European call
option (8.33), the initial condition is

$$z(0, y) = h(y) = e^{(\kappa-1) y/2}\, \max\{\, e^y - p,\ 0 \,\},$$

and so

$$z(\tau, y) = \frac{1}{2\sqrt{\pi \tau}} \int_{\log p}^{\infty} e^{-(y-\eta)^2/(4\tau) + (\kappa-1)\eta/2}\, (e^\eta - p)\, d\eta.$$

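The integral can be evaluated by completing the square inside the exponential, producing

$$z(\tau, y) = \tfrac12 \left[ e^{(\kappa+1)^2 \tau/4 + (\kappa+1) y/2}\, \operatorname{erfc}\left( \frac{\log p - (\kappa+1)\tau - y}{2\sqrt{\tau}} \right) - p\, e^{(\kappa-1)^2 \tau/4 + (\kappa-1) y/2}\, \operatorname{erfc}\left( \frac{\log p - (\kappa-1)\tau - y}{2\sqrt{\tau}} \right) \right], \tag{8.42}$$

where

$$\operatorname{erfc} x = \frac{2}{\sqrt{\pi}} \int_x^{\infty} e^{-z^2}\, dz = 1 - \operatorname{erf} x \tag{8.43}$$

is the complementary error function, cf. (2.87). Substituting (8.42) into (8.40) results in
the celebrated Black-Scholes formula for a European call option:

$$u(t,x) = \tfrac12 \left[ x\, \operatorname{erfc}\left( -\,\frac{\bigl( r + \frac12 \sigma^2 \bigr)(t^* - t) + \log(x/p)}{\sqrt{2 \sigma^2 (t^* - t)}} \right) - p\, e^{-r(t^* - t)}\, \operatorname{erfc}\left( -\,\frac{\bigl( r - \frac12 \sigma^2 \bigr)(t^* - t) + \log(x/p)}{\sqrt{2 \sigma^2 (t^* - t)}} \right) \right]. \tag{8.44}$$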

A graph of the solution for the specific values $t^* = 10$, $r = .1$, $\sigma = .2$, $p = 10$ appears in
Figure 8.4. Observe that the option’s value slowly decreases as the time gets closer and
closer to the exercise time $t^*$, thereby lessening any chances of further profit stemming
from the option’s underlying price volatility.

Exercises


8.1.1.  Find  the  solution  to  the  heat  equation  ut  =  uxx  on  the  real  line  having  the  following 
initial  condition  at  time  t  =  0.  Then  sketch  graphs  of  the  resulting  temperature  distribution 
at  times  t  =  0,1,  and  5. 


(a) 


—  X 


(b)  the  step  function  cr(x),  (c)  e 


—  X 


(d) 


X 


X 


<  1. 


otherwise. 


8.1.2. On an infinite bar with unit thermal diffusivity, a concentrated unit heat source is
instantaneously applied at the origin at time $t = 0$. A heat sensor measures the resulting
temperature in the bar at position $x = 1$. Determine the maximum temperature measured
by the sensor. At what time is the maximum achieved?

8.1.3. (a) Find the solution to the heat equation (8.6) whose initial data corresponds to a pair
of unit heat sources placed at positions $x = \pm 1$. (b) Graph the solution at times
$t = .1, .25, .5, 1$. (c) At what time(s) does the origin experience its maximum overall
temperature? What is the maximum temperature at the origin?

8.1.4. (a) Use the Fourier transform to solve the initial value problem

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}, \qquad u(0,x) = \delta'(x - \xi), \qquad -\infty < x < \infty, \quad t > 0,$$

whose initial data is the derivative of the delta function at a fixed position $\xi$.

(b) Show that your solution can be written as the derivative $\partial F/\partial x$ of the fundamental
solution $F(t,x;\xi)$. Explain why this observation should be valid.

8.1.5. Suppose that the initial data $u(0,x) = f(x)$ is real. Explain why the Fourier transform
solution formula (8.13) defines a real function $u(t,x)$ for all $t > 0$.

8.1.6. (a) What is the maximum value of the fundamental solution at time $t$?
(b) Can you justify the claim that its width is proportional to $\sqrt{t}$?

8.1.7.  Prove  directly  that  (8.5)  is  indeed  a  solution  to  the  heat  equation,  and,  moreover,  has 
the  correct  initial  and  boundary  conditions. 

8.1.8.  Show,  by  a  direct  computation,  that  the  final  formula  in  (8.14)  is  a  solution  to  the  heat 
equation  for  all  t  >  0. 

♦ 8.1.9. Justify formula (8.15).

8.1.10. According to Exercises 4.1.11-12, both the $t$ and $x$ partial derivatives of the fundamental
solution solve the heat equation. (a) Write down the initial value problem satisfied by
these two solutions. (b) Set $\xi = 0$ and then sketch graphs of each solution at several
selected times. (c) Reconstruct each solution as a Fourier integral.

8.1.11. Let $u(t,x) = \dfrac{\partial F}{\partial x}(t,x;0)$ denote the $x$ derivative of the fundamental solution (8.14).
(a) Prove that $u(t,x)$ is a solution to the heat equation $u_t = u_{xx}$ on the domain
$\{\, -\infty < x < \infty,\ t > 0 \,\}$. (b) For fixed $x$, prove that $\lim_{t \to 0^+} u(t,x) = 0$. (c) Explain why,
despite the results in parts (a) and (b), $u(t,x)$ is not a classical solution to the initial value
problem $u_t = u_{xx}$, $u(0,x) = 0$. What is the classical solution? (d) What initial value
problem does $u(t,x)$ satisfy?

8.1.12.  Justify  all  the  statements  in  Example  8.2. 

♣ 8.1.13. (a) Solve the heat equation on an infinite bar when the initial temperature is equal to 1
for $|x| < 1$ and 0 elsewhere, while a unit heat source is applied to the same part of the bar
$|x| < 1$ for a unit time period $0 < t < 1$. (b) At what time and what location is the bar
the hottest? (c) What is the final equilibrium temperature of the bar?

8.1.14. An insulated bar 1 meter long, with constant diffusivity $\gamma = 1$, is taken from a freezer
that is kept at −10° C, and then has its ends kept at room temperature of 20° C. A
soldering iron with temperature 350° C is continually held at the midpoint of the bar.

(a) Set up an initial value problem modeling the temperature distribution in the bar.

(b) Find the corresponding equilibrium temperature distribution.

♠ 8.1.15. Consider the heat equation with unit thermal diffusivity on the interval $0 < x < 1$
subject to homogeneous Dirichlet boundary conditions.

(a) Find a Fourier series representation for the fundamental solution $F(t,x;\xi)$ that solves
the initial-boundary value problem

$$u_t = u_{xx}, \quad t > 0, \quad 0 < x < 1, \qquad u(0,x) = \delta(x - \xi), \qquad u(t,0) = 0 = u(t,1).$$

Your solution should depend on $t$, $x$ and the point $\xi$ where the initial delta impulse is
applied.

(b) For the value $\xi = .3$, use a computer program to sum the first few terms in the series
and graph the result at times $t = .0001, .001, .01$, and $.1$. Make sure you have included
enough terms to obtain a reasonably accurate graph.

(c) Compare your graphs with those of the fundamental solution $F(t,x;.3)$ on an infinite
interval at the same times. What is the maximum deviation between the two solutions
on the entire interval $0 < x < 1$?

(d) Use your fundamental solution $F(t,x;\xi)$ to construct a series solution to the general
initial value problem $u(0,x) = f(x)$. Is your series the same as the usual Fourier series
solution? If not, explain any discrepancy.

8.1.16.  True  or  false:  Periodic  forcing  of  the  heat  equation  at  a  particular  frequency  can 
produce  resonance.  Justify  your  answer. 

8.1.17. Find the fundamental solution for the cable equation $v_t = \gamma\, v_{xx} - \alpha\, v$ on the real line.
Hint: See Exercise 4.1.16.

8.1.18. The partial differential equation $u_t + c\, u_x = \gamma\, u_{xx}$ models transport of a diffusing
pollutant in a fluid flow. Assuming that the speed $c$ is constant, write down a solution to
the initial value problem $u(0,x) = f(x)$ for $-\infty < x < \infty$. Hint: Look at Exercise 4.1.17.

♦ 8.1.19. Use the Fourier transform to solve the initial value problem $i\, u_t = u_{xx}$, $u(0,x) = f(x)$,
for the one-dimensional Schrödinger equation on the real line $-\infty < x < \infty$.

♥ 8.1.20. Let $u(t,x)$ be a solution to the heat equation having finite thermal energy,
$E(t) = \int_{-\infty}^{\infty} u(t,x)\, dx < \infty$, and satisfying $u_x(t,x) \to 0$ as $x \to \pm\infty$, for all $t \ge 0$. Prove the
law of conservation of thermal energy: $E(t) = $ constant.

8.1.21. Explain in your own words how a function $u(t,x)$ can satisfy $u(t,x) \to 0$ uniformly as
$t \to \infty$ while maintaining $\int_{-\infty}^{\infty} u(t,x)\, dx = 1$ for all $t$. Discuss what this
signifies regarding the interchange of limits and integrals.

8.1.22. (a) Prove that if $\hat f(k) \in L^2$ is square-integrable, then so is $e^{-a k^2} \hat f(k)$ for any $a > 0$.

(b) Prove that when the initial data $f(x) \in L^2$ is square integrable, so is the Fourier
integral solution (8.13) for all $t \ge 0$.

8.1.23.  Find  the  solution  to  the  Black-Scholes  equation  for  a  put  option  (8.34). 

8.1.24. (a) If we increase the interest rate $r$, does the value of a call option (i) increase;
(ii) decrease; (iii) stay the same; (iv) could do any of the above? Justify your answer.
(b) Answer the same question when the interest rate stays fixed, but the volatility $\sigma$ is increased.

♦ 8.1.25. Justify formula (8.42).

8.2 Symmetry and Similarity

The geometric approach to partial differential equations enables one to exploit their symmetry
properties to construct explicit solutions of both mathematical and physical interest.
Unlike separation of variables, which is restricted to special types of linear partial differential
equations,* symmetry methods can also be successfully applied to a broad range of
nonlinear partial differential equations. While we do not have the mathematical tools to
develop the full range of symmetry techniques, we will learn how to exploit some of the
most basic symmetry properties: translations, leading to traveling wave solutions; scalings,
leading to similarity solutions; and, in subsequent chapters, rotational symmetries.

In general, by a symmetry of an equation, we mean a transformation that takes solutions
to solutions. Thus, knowing a symmetry transformation, if we are in possession of
one solution, then we can construct a second solution by applying the symmetry. And,
possibly, a third solution by applying the symmetry yet again. And so on. If we know lots
of symmetries, then we can produce lots of solutions by this simple device.


Remark: General symmetry techniques are founded on the theory of Lie groups,
named after the influential nineteenth-century Norwegian mathematician Sophus Lie
(pronounced “Lee”). Lie’s theory is a profound synthesis of group theory and differential
geometry, and provides an algorithm for completely determining all the (continuous)
symmetries of a given differential equation. Although the theory lies beyond the scope of
this introductory text, direct inspection and/or physical intuition will often produce the
most important symmetries of the system, which can then be directly exploited. Modern
applications of Lie’s symmetry methods to partial differential equations arising in physics
and engineering can be traced back to an influential book on hydrodynamics by the
author’s thesis advisor, Garrett Birkhoff, [17]. A complete and comprehensive treatment
of Lie symmetry methods can be found in the author’s first book [87], and, at a more
introductory level, in the recent books [27, 58], the first having a particular emphasis on
applications in fluid mechanics.


The heat equation serves as an excellent testing ground for the general methodology,
since it admits a rich variety of symmetry transformations that take solutions to solutions.
The simplest are the translations. Moving the space and time coordinates by a fixed
amount,

$$t \longmapsto t + a, \qquad x \longmapsto x + b, \tag{8.45}$$

where $a, b$ are constants, changes the function $u(t,x)$ into the translated function†

$$U(t,x) = u(t - a,\ x - b). \tag{8.46}$$

A simple application of the chain rule proves that the partial derivatives of $U$ with respect
to $t$ and $x$ agree with the corresponding partial derivatives of $u$, so

$$\frac{\partial U}{\partial t} = \frac{\partial u}{\partial t}, \qquad \frac{\partial U}{\partial x} = \frac{\partial u}{\partial x}, \qquad \frac{\partial^2 U}{\partial x^2} = \frac{\partial^2 u}{\partial x^2},$$

and so on. In particular, the function $U(t,x)$ is a solution to the heat equation $U_t = \gamma\, U_{xx}$
whenever $u(t,x)$ also solves $u_t = \gamma\, u_{xx}$. Physically, translation symmetry formalizes the
property that the heat equation models a homogeneous medium, and hence the solution
does not depend on the choice of reference point or origin of our coordinate system.

* This is not entirely fair: separation of variables can also be applied to certain nonlinear
partial differential equations such as Hamilton-Jacobi equations, [73].

† The minus signs arise because when we set $\hat t = t + a$, $\hat x = x + b$, then the translated function
is $\hat U(\hat t, \hat x) = u(t,x) = u(\hat t - a,\ \hat x - b)$. Dropping the hats produces the stated formula.

As a consequence, each solution to the heat equation will produce an infinite family
of translated solutions. For example, starting with the separable solution

$$u(t,x) = e^{-\gamma t} \sin x,$$

we immediately produce the additional translated solutions

$$U(t,x) = e^{-\gamma (t-a)} \sin(x - b),$$

valid for any choice of constants $a, b$.
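Translation symmetry is easy to confirm numerically. The sketch below builds the translated function (8.46) from the separable solution $e^{-\gamma t} \sin x$ and estimates $U_t - \gamma\, U_{xx}$ by centered differences. The value of $\gamma$, the shifts, the sample point, and the helper names `translate` and `heat_residual` are illustrative assumptions.

```python
import math

GAMMA = 0.5          # diffusivity (illustrative value)

def u(t, x):
    """Separable solution u = exp(-gamma t) sin x of u_t = gamma u_xx."""
    return math.exp(-GAMMA * t) * math.sin(x)

def translate(f, a, b):
    """Translated function (8.46): U(t, x) = f(t - a, x - b)."""
    return lambda t, x: f(t - a, x - b)

def heat_residual(f, t, x, h=1e-3):
    """Centered-difference estimate of f_t - GAMMA * f_xx."""
    f_t = (f(t + h, x) - f(t - h, x)) / (2.0 * h)
    f_xx = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / h ** 2
    return f_t - GAMMA * f_xx

a, b = 1.3, -0.4
U = translate(u, a, b)
print(U(0.7, 0.2), math.exp(-GAMMA * (0.7 - a)) * math.sin(0.2 - b))  # same number twice
print(heat_residual(U, 0.7, 0.2))                                     # near zero
```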

Warning: Typically, the symmetries of a differential equation do not respect initial
or boundary conditions. For instance, if $u(t,x)$ is defined for $t > 0$ and in the domain
$0 < x < \ell$, then its translated version (8.46) is defined for $t > a$ and in the translated
domain $b < x < \ell + b$, and so will solve a translated initial-boundary value problem.

A second important class of symmetries consists of the scaling invariances. We already
know that if $u(t,x)$ is a solution, then so is the scalar multiple $c\, u(t,x)$ for any constant $c$;
this is a simple consequence of linearity of the heat equation. We can also add an arbitrary
constant to the temperature, noting that

$$U(t,x) = c\, u(t,x) + k \tag{8.47}$$

is a solution for any choice of constants $c, k$. Physically, the transformation (8.47) amounts
to a change in the scale used to measure temperature. For instance, if $u$ is measured in
degrees Celsius, and we set $c = \frac95$ and $k = 32$, then $U = \frac95\, u + 32$ will be measured in degrees
Fahrenheit. Thus, reassuringly, the physical processes described by the heat equation do
not depend on our choice of thermometer.

More interestingly, suppose we rescale the space and time variables:

$$t \longmapsto \alpha\, t, \qquad x \longmapsto \beta\, x, \tag{8.48}$$

where $\alpha, \beta \ne 0$ are nonzero constants. The effect of such a scaling transformation is to
convert $u(t,x)$ into a rescaled function‡

$$U(t,x) = u(\alpha^{-1} t,\ \beta^{-1} x). \tag{8.49}$$

The derivatives of $U$ are related to those of $u$ according to the formulas

$$\frac{\partial U}{\partial t} = \frac{1}{\alpha}\, \frac{\partial u}{\partial t}, \qquad \frac{\partial U}{\partial x} = \frac{1}{\beta}\, \frac{\partial u}{\partial x}, \qquad \frac{\partial^2 U}{\partial x^2} = \frac{1}{\beta^2}\, \frac{\partial^2 u}{\partial x^2}.$$

Therefore, if $u$ satisfies the heat equation $u_t = \gamma\, u_{xx}$, then $U$ satisfies the rescaled heat
equation

$$U_t = \frac{1}{\alpha}\, u_t = \frac{\gamma}{\alpha}\, u_{xx} = \frac{\beta^2 \gamma}{\alpha}\, U_{xx},$$

which we rewrite as

$$U_t = \Gamma\, U_{xx}, \qquad \text{where} \qquad \Gamma = \frac{\beta^2 \gamma}{\alpha}. \tag{8.50}$$

Thus, the net effect of scaling space and time is merely to rescale the diffusion coefficient.
Physically, the scaling symmetry (8.48) corresponds to a change in the physical units used
to measure time and distance. For instance, to change from minutes to seconds, set $\alpha = 60$,
and from yards to meters, set $\beta = .9144$. The net effect (8.50) on the diffusion coefficient
$\gamma$ is a reflection of its physical units, namely distance$^2$/time.

‡ As before, setting $\hat t = \alpha\, t$, $\hat x = \beta\, x$, produces the rescaled function
$\hat U(\hat t, \hat x) = u(t,x) = u(\alpha^{-1} \hat t,\ \beta^{-1} \hat x)$, and we then drop the hats.
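The rescaling rule (8.50) can be tested on a concrete solution. In the sketch below, the rescaled function (8.49) is built from $u = e^{-\gamma t} \sin x$ and its finite-difference residual is evaluated twice: once with the predicted coefficient $\Gamma = \beta^2 \gamma / \alpha$ and once with the original $\gamma$. All numerical values and helper names are illustrative assumptions.

```python
import math

GAMMA = 0.5                      # diffusivity of the original equation (illustrative)

def u(t, x):
    """Separable solution of u_t = GAMMA * u_xx."""
    return math.exp(-GAMMA * t) * math.sin(x)

def rescale(f, alpha, beta):
    """Rescaled function (8.49): U(t, x) = f(t / alpha, x / beta)."""
    return lambda t, x: f(t / alpha, x / beta)

def heat_residual(f, diffusivity, t, x, h=1e-3):
    """Centered-difference estimate of f_t - diffusivity * f_xx."""
    f_t = (f(t + h, x) - f(t - h, x)) / (2.0 * h)
    f_xx = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / h ** 2
    return f_t - diffusivity * f_xx

alpha, beta = 3.0, 2.0
U = rescale(u, alpha, beta)
Gamma = beta ** 2 * GAMMA / alpha           # rescaled coefficient predicted by (8.50)
print(heat_residual(U, Gamma, 1.0, 1.0))    # near zero: U solves U_t = Gamma U_xx
print(heat_residual(U, GAMMA, 1.0, 1.0))    # clearly nonzero: the coefficient has changed
```

Choosing `alpha = beta ** 2` makes `Gamma` equal to `GAMMA`, so that both residuals are then near zero.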

In particular, if we choose

$$\alpha = \gamma, \qquad \beta = 1,$$

then the rescaled diffusion coefficient becomes $\Gamma = 1$. This observation has the following
important consequence. If $U(t,x)$ solves the heat equation for a unit diffusivity, $\Gamma = 1$,
then

$$u(t,x) = U(\gamma\, t,\ x) \tag{8.51}$$

solves the heat equation for the diffusivity $\gamma > 0$. Thus, the only effect of the diffusion
coefficient is to speed up or slow down time. A body with diffusivity $\gamma = 2$ will cool down
twice as fast as a body (of the same shape subject to similar boundary conditions and initial
conditions) with diffusivity $\gamma = 1$. Note that this particular rescaling has not altered the
space coordinates, and so $U(t,x)$ is defined on the same spatial domain as $u(t,x)$.

On the other hand, if we set $\alpha = \beta^2$, then the rescaled diffusion coefficient is exactly
the same as the original: $\Gamma = \gamma$. Thus, the transformation

$$t \longmapsto \beta^2 t, \qquad x \longmapsto \beta\, x, \tag{8.52}$$

does not alter the equation, and hence defines a scaling symmetry, also known as a
similarity transformation, for the heat equation. Combining (8.52) with the linear rescaling
$u \mapsto c\, u$, we make the elementary, but important, observation that if $u(t,x)$ is any solution
to the heat equation, then so is the function

$$U(t,x) = c\, u(\beta^{-2} t,\ \beta^{-1} x), \tag{8.53}$$

for the same diffusion coefficient $\gamma$. For example, rescaling the solution

$$u(t,x) = e^{-\gamma t} \cos x \qquad \text{leads to the solution} \qquad U(t,x) = c\, e^{-\gamma t/\beta^2} \cos\frac{x}{\beta}.$$

Warning: As in the case of translations, rescaling space by a factor $\beta \ne 1$ will alter
the domain of definition of the solution. If $u(t,x)$ is defined for $a < x < b$, then $U(t,x)$, as
given in (8.53), is defined for $\beta a < x < \beta b$ (or, when $\beta < 0$, for $\beta b < x < \beta a$).

For example, suppose that we have solved the heat equation for the temperature $u(t,x)$
on a bar of length 1, subject to certain initial and boundary conditions. We are then given
a bar composed of the same material of length 2. Since the diffusivity coefficient has not
changed, we can directly construct the new solution $U(t,x)$ by rescaling. Setting $\beta = 2$
will serve to double the length. If we also rescale time by a factor $\alpha = \beta^2 = 4$, then the
rescaled function $U(t,x) = u\bigl( \tfrac14 t,\ \tfrac12 x \bigr)$ will be a solution of the heat equation on the longer
bar with the same diffusivity constant. The net effect is that the rescaled solution will be
evolving four times as slowly as the original, and hence it effectively takes a bar that is
twice the length four times as long to cool down.
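To make the factor of four concrete, here is a small sketch under stated assumptions: unit diffusivity, homogeneous Dirichlet boundary conditions, and the single-mode initial profile $\sin \pi x$ on the unit bar. None of these choices is specified in the text. The helper `cooling_time` uses bisection to find when the midpoint temperature has dropped to 1% of its initial value.

```python
import math

GAMMA = 1.0    # assumed diffusivity, the same for both bars

def u(t, x):
    """Single Dirichlet mode on the bar 0 < x < 1 (assumed initial profile sin(pi x))."""
    return math.exp(-GAMMA * math.pi ** 2 * t) * math.sin(math.pi * x)

def U(t, x):
    """Rescaled solution on the bar 0 < x < 2, with beta = 2 and alpha = beta^2 = 4."""
    return u(t / 4.0, x / 2.0)

def cooling_time(f, x_mid, fraction=0.01, t_hi=100.0):
    """Bisection for the time at which f(t, x_mid) has dropped to fraction * f(0, x_mid)."""
    target = fraction * f(0.0, x_mid)
    lo, hi = 0.0, t_hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid, x_mid) > target:
            lo = mid          # still too warm: the crossing time is later
        else:
            hi = mid          # already cool enough: the crossing time is earlier
    return 0.5 * (lo + hi)

t_short = cooling_time(u, 0.5)    # midpoint of the bar of length 1
t_long = cooling_time(U, 1.0)     # midpoint of the bar of length 2
print(t_short, t_long, t_long / t_short)
print(U(0.3, 0.0), U(0.3, 2.0))   # boundary values of the longer bar, zero up to rounding
```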
Similarity Solutions

A similarity solution of a partial differential equation is one that remains unchanged
(invariant) under a one-parameter family* of scaling symmetries. For a partial differential
equation in two variables, say $t$ and $x$, the similarity solutions can be found by solving
an ordinary differential equation.

Suppose our partial differential equation admits the scaling symmetries

$$t \longmapsto \beta^a t, \qquad x \longmapsto \beta^b x, \qquad u \longmapsto \beta^c u, \qquad \beta \ne 0, \tag{8.54}$$

where $a, b, c$ are fixed constants with $a, b$ not both zero. As above, this means that if $u(t,x)$
is a solution to the differential equation, so is the rescaled function

$$U(t,x) = \beta^c\, u(\beta^{-a} t,\ \beta^{-b} x) \tag{8.55}$$

for all values of $\beta \ne 0$. Checking that this indeed defines a symmetry is a simple matter of
applying the chain rule, which implies that the derivatives scale according to

$$u_t \longmapsto \beta^{c-a} u_t, \qquad u_x \longmapsto \beta^{c-b} u_x, \qquad u_{tt} \longmapsto \beta^{c-2a} u_{tt}, \qquad u_{xt} \longmapsto \beta^{c-a-b} u_{xt}, \tag{8.56}$$

and so on. Products of derivatives scale multiplicatively, e.g., $x^4\, u\, u_{xt} \longmapsto \beta^{2c-a+3b}\, x^4\, u\, u_{xt}$.
In order that a (polynomial) differential equation admit such a scaling symmetry, each of
its terms must scale by the same overall power of $\beta$.
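The bookkeeping behind (8.56) is mechanical and can be sketched in a few lines. In the encoding below, which is an assumption of this sketch and not notation from the text, a monomial is a triple: the power of $t$, the power of $x$, and a list of pairs $(i, j)$, each standing for a factor $\partial^{i+j} u / \partial t^i \partial x^j$ (with $(0,0)$ meaning $u$ itself). The weight is returned as coefficients $(A, B, C)$ of the exponent $A a + B b + C c$.

```python
def monomial_weight(t_power, x_power, derivatives):
    """Return (A, B, C) such that the monomial scales by beta**(A*a + B*b + C*c).

    By (8.54), t and x contribute beta**a and beta**b per power; by (8.56), each
    factor d^(i+j) u / dt^i dx^j contributes beta**(c - i*a - j*b).
    """
    A, B, C = t_power, x_power, 0
    for i, j in derivatives:
        A -= i
        B -= j
        C += 1
    return (A, B, C)

def admits_scaling(terms, a, b, c):
    """True when every term of the equation scales by the same power of beta."""
    exponents = set()
    for t_power, x_power, derivatives in terms:
        A, B, C = monomial_weight(t_power, x_power, derivatives)
        exponents.add(A * a + B * b + C * c)
    return len(exponents) == 1

# The monomial x^4 u u_xt from the text: exponent 2c - a + 3b.
print(monomial_weight(0, 4, [(0, 0), (1, 1)]))

HEAT = [(0, 0, [(1, 0)]), (0, 0, [(0, 2)])]               # terms u_t and u_xx
TRANSPORT = [(0, 0, [(1, 0)]), (0, 0, [(0, 0), (0, 1)])]  # terms u_t and u u_x

print(admits_scaling(HEAT, 2, 1, 0), admits_scaling(HEAT, 1, 1, 0))
print(admits_scaling(TRANSPORT, 1, 2, 1), admits_scaling(TRANSPORT, 1, 2, 0))
```

For the heat equation the two weights agree exactly when $a = 2b$, with $c$ arbitrary, which recovers the similarity transformation (8.52) combined with the linear rescaling of $u$.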

By definition, $u(t,x)$ is called a similarity solution if it remains unchanged (invariant)
under the scaling symmetries (8.54), so that

$$u(t,x) = \beta^c\, u(\beta^{-a} t,\ \beta^{-b} x) \tag{8.57}$$

for all $\beta > 0$. Let us, for specificity, assume that $a \ne 0$, leaving the case $a = 0$, $b \ne 0$,
for the reader to complete in Exercise 8.2.13. Since the left-hand side of (8.57) does not
depend on $\beta$, we can fix its value to be† $\beta = t^{1/a}$ and conclude that the similarity solution
must have the form

$$u(t,x) = t^{c/a}\, v(\xi), \qquad \text{where} \qquad \xi = x\, t^{-b/a} \quad \text{and} \quad v(\xi) = u(1, \xi). \tag{8.58}$$

The variables $\xi$ and $v$ are referred to as the similarity variables, since they remain invariant
when subjected to the scaling transformations (8.54). We then use the chain rule to find the
formulas for the partial derivatives of $u$ in terms of the ordinary derivatives of $v$ with respect
to $\xi$. Substituting these expressions into the scale-invariant partial differential equation for
$u(t,x)$, and then canceling a common power of $t$, will effectively reduce it to an ordinary
differential equation for the function $v(\xi)$. Each solution to the resulting ordinary differential
equation then gives rise to a scale-invariant solution to the original partial differential equation
through the similarity ansatz (8.58).

* Or, more accurately, a one-parameter group, [87].

† This assumes $t > 0$; for $t < 0$, just replace $t$ by $-t$.

Example 8.4. As a first example, let us return to the nonlinear transport equation

$$u_t + u\, u_x = 0, \tag{8.59}$$

which we studied in Section 2.3. Under (8.54,56), the equation rescales to

$$\beta^{c-a}\, u_t + \beta^{2c-b}\, u\, u_x = 0,$$

which is unchanged, provided $c - a = 2c - b$, and hence $c = b - a$. Setting $a = 1$, $c = b - 1$,
we conclude that if $u(t,x)$ is any solution, then so is the rescaled function

$$U(t,x) = \beta^{b-1}\, u(\beta^{-1} t,\ \beta^{-b} x)$$

for any $b$ and any $\beta \ne 0$.

To find the associated similarity solutions, we use (8.58) to introduce the ansatz

$$u(t,x) = t^{b-1}\, v(\xi), \qquad \text{where} \qquad \xi = x\, t^{-b}. \tag{8.60}$$

Differentiating, we obtain

$$u_t = -b\, x\, t^{-2}\, v'(\xi) + (b-1)\, t^{b-2}\, v(\xi) = t^{b-2} \bigl[ -b\, \xi\, v'(\xi) + (b-1)\, v(\xi) \bigr], \qquad u_x = t^{-1}\, v'(\xi).$$

Substituting these expressions into the transport equation (8.59) yields

$$0 = u_t + u\, u_x = t^{b-2} \bigl[ (v - b\,\xi)\, v' + (b-1)\, v \bigr],$$

and so

$$(v - b\,\xi)\, \frac{dv}{d\xi} + (b-1)\, v = 0. \tag{8.61}$$

Any solution to this nonlinear first-order ordinary differential equation will, when substituted into (8.60), produce a similarity solution to the nonlinear transport equation.

If $b = 1$, then either $v = \xi$, producing the particular similarity solution $u(t,x) = x/t$ that we earlier used to construct the rarefaction wave (2.54), or $v$ is constant, and so is $u$. Otherwise, we can, in fact, linearize (8.61) by treating $\xi$ as a function of $v$, whence

$$(b-1)\, v\, \frac{d\xi}{dv} - b\, \xi = -v.$$

The general solution to such a linear first-order ordinary differential equation is found by the standard method, [18, 23], resulting in

$$\xi = v + k\, v^{b/(b-1)},$$

where $k$ is the constant of integration. Recalling (8.60), we find that the similarity solutions $u(t,x)$ are defined by an implicit equation

$$x = k\, u^{b/(b-1)} + t\, u.$$

For example, if $b = 2$, the (multi-valued) solution is a sideways-moving parabola:

$$x = k\, u^2 + t\, u, \qquad \text{so that} \qquad u = \frac{-t \pm \sqrt{t^2 + 4kx}}{2k}.$$
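As a sanity check on the case $b = 2$, the explicit branch $u = \bigl(-t + \sqrt{t^2 + 4kx}\bigr)/(2k)$ can be tested numerically. The sketch below is only an illustration under stated assumptions: the value $k = 1$, the sample points, and the step size $h$ are arbitrary choices, and the derivatives are approximated by central differences rather than computed exactly.

```python
import math


def similarity_u(t, x, k=1.0):
    """Upper ('+') branch of the parabola x = k*u**2 + t*u (the case b = 2)."""
    return (-t + math.sqrt(t * t + 4.0 * k * x)) / (2.0 * k)


def transport_residual(u, t, x, h=1e-5):
    """Central-difference estimate of u_t + u*u_x at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2.0 * h)
    return u_t + u(t, x) * u_x


sample_points = [(t, x) for t in (0.5, 1.0, 2.0) for x in (0.3, 1.0, 4.0)]
max_residual = max(
    abs(transport_residual(similarity_u, t, x)) for t, x in sample_points
)

# Scale invariance with a = 1, b = 2, c = b - 1 = 1:
# u(t, x) should equal beta * u(t / beta, x / beta**2).
beta = 3.0
scaling_gap = max(
    abs(similarity_u(t, x) - beta * similarity_u(t / beta, x / beta**2))
    for t, x in sample_points
)
print(max_residual, scaling_gap)
```

Both printed numbers should be close to zero, up to discretization and rounding error.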


Example 8.5. Consider the linear heat equation

$$u_t = u_{xx}. \tag{8.62}$$

Under the rescaling (8.54), the equation becomes $\beta^{c-a} u_t = \beta^{c-2b} u_{xx}$, and thus (8.54) represents a symmetry if and only if $a = 2b$. Therefore, if $u(t,x)$ is any solution, so is the rescaled function

$$U(t,x) = \beta^{c}\, u(\beta^{-2} t, \beta^{-1} x).$$

Of course, the initial scaling factor $\beta^c$ stems from the linearity of the equation. The scale-invariant solutions are constructed through the similarity ansatz

$$u(t,x) = t^{c/2}\, v(\xi), \qquad \text{where} \qquad \xi = x/\sqrt{t}.$$

Differentiation yields

$$u_t = -\tfrac12\, x\, t^{c/2-3/2}\, v'(\xi) + \tfrac12\, c\, t^{c/2-1}\, v(\xi) = t^{c/2-1} \bigl[ -\tfrac12\, \xi\, v'(\xi) + \tfrac12\, c\, v(\xi) \bigr], \qquad u_{xx} = t^{c/2-1}\, v''(\xi).$$

Substituting these expressions into the heat equation and canceling a common power of $t$, we find that $v$ must satisfy the linear ordinary differential equation

$$v'' + \tfrac12\, \xi\, v' - \tfrac12\, c\, v = 0. \tag{8.63}$$

If $c = 0$, then (8.63) is effectively a linear first-order ordinary differential equation for $v'(\xi)$, which can be readily solved by the usual method, thereby producing the solution

$$v(\xi) = c_1 + c_2\, \operatorname{erf}\bigl(\tfrac12\, \xi\bigr),$$

where $c_1, c_2$ are arbitrary constants and erf is the error function (2.87). The corresponding similarity solution to the heat equation is

$$u(t,x) = c_1 + c_2\, \operatorname{erf}\Bigl(\frac{x}{2\sqrt{t}}\Bigr).$$

The error function solutions that we encountered in (8.17) can be built up as a linear combination of translations of this similarity solution.
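The error-function similarity solution can be checked in the same spirit. The following sketch assumes nothing beyond the formula $u = c_1 + c_2\, \operatorname{erf}\bigl(x/(2\sqrt{t}\,)\bigr)$; the sample points, step size, and the choice $c_1 = 0$, $c_2 = 1$ are arbitrary, and the residual is a finite-difference estimate, not an exact evaluation.

```python
import math


def similarity_solution(t, x, c1=0.0, c2=1.0):
    """u(t, x) = c1 + c2 * erf(x / (2 sqrt(t))), defined for t > 0."""
    return c1 + c2 * math.erf(x / (2.0 * math.sqrt(t)))


def heat_residual(u, t, x, h=1e-3):
    """Finite-difference estimate of u_t - u_xx at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / (h * h)
    return u_t - u_xx


sample_points = [(t, x) for t in (0.5, 1.0, 2.0) for x in (-1.0, 0.3, 2.0)]
max_residual = max(
    abs(heat_residual(similarity_solution, t, x)) for t, x in sample_points
)

# With c = 0 the solution is invariant under t -> beta**2 * t, x -> beta * x.
beta = 1.7
scaling_gap = max(
    abs(similarity_solution(t, x) - similarity_solution(beta**2 * t, beta * x))
    for t, x in sample_points
)
print(max_residual, scaling_gap)
```

The residual is limited by the truncation error of the difference quotients, so it is small but not exactly zero.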

If $c \neq 0$, most solutions to the ordinary differential equation (8.63) are not elementary functions.† One is in need of more sophisticated techniques, e.g., the method of power series to be developed in Section 11.3, to understand its solutions, and hence the resulting similarity solutions to the heat equation.


Exercises 

8.2.1.  If  it  takes  a  2  cm  long  insulated  bar  23  minutes  to  cool  down  to  room  temperature,  how 
long  does  it  take  a  4  cm  bar? 

8.2.2.  If  it  takes  a  5  centimeter  long  insulated  iron  bar  10  minutes  to  cool  down  so  as  not  to 
burn  your  hand,  how  long  does  it  take  a  20  centimeter  bar  made  out  of  the  same  material 
to  cool  down  to  the  same  temperature? 

8.2.3. (a) Given $\gamma > 0$, use a scaling transformation to write down the formula for the fundamental solution for the general heat equation $u_t = \gamma\, u_{xx}$ for $x \in \mathbb{R}$. (b) Write down the corresponding integral formula for the solution to the initial value problem.


† According to [87; Example 3.3], the general solution can be written in terms of parabolic cylinder functions, [86].

8.2.4. Use scaling to construct the series solution for a heated circular ring of radius $r$ and thermal diffusivity $\gamma$. Does scaling also give the correct formulas for the Fourier coefficients in terms of the initial temperature distribution?

8.2.5.  A  solution  u(t,  x)  to  the  heat  equation  is  measured  in  degrees  Fahrenheit.  What  is  the 
corresponding  temperature  in  degrees  Kelvin?  Which  symmetry  transformation  takes  the 
first  solution  to  the  second  solution,  and  how  does  it  affect  the  diffusion  coefficient? 

8.2.6. Is time reversal, $t \mapsto -t$, a symmetry of the heat equation? Write down a physical explanation, and then a mathematical justification.

8.2.7. According to Exercise 4.1.17, the partial differential equation $u_t + c\, u_x = \gamma\, u_{xx}$ models diffusion in a convective flow. Show how to use scaling to place the differential equation in the form $u_t + u_x = P^{-1} u_{xx}$, where $P$ is called the Peclet number, and controls the rate of mixing. Is there a scaling that will reduce the problem to the case $P = 1$?

8.2.8. Suppose you know a solution $u^\star(t,x)$ to the heat equation that satisfies $u^\star(1,x) = f(x)$. Explain how to solve the initial value problem with $u(0,x) = f(x)$.

8.2.9. Solve the following initial value problems for the heat equation $u_t = u_{xx}$ for $x \in \mathbb{R}$:

(a) $u(0,x) = e^{-x^2/4}$. Hint: Use Exercise 8.2.8. (b) $u(0,x) = e^{-4x^2}$.

(c) $u(0,x) = x^2 e^{-x^2/4}$. Hint: Use Exercise 4.1.12.

8.2.10. Define the functions $H_n(x)$ for $n = 0, 1, 2, \ldots$, by the formula

$$\frac{d^n}{dx^n}\, e^{-x^2} = (-1)^n H_n(x)\, e^{-x^2}. \tag{8.64}$$

(a) Prove that $H_n(x)$ is a polynomial of degree $n$, known as the $n$th Hermite polynomial.

(b) Calculate the first four Hermite polynomials.

(c) Assuming $\gamma = 1$, find the solution to the heat equation for $-\infty < x < \infty$ and $t > 0$, given the initial data $u(0,x) = H_n(x)\, e^{-x^2}$. Hint: Combine Exercises 4.1.11, 8.2.8.


8.2.11. Find the scaling symmetries and corresponding similarity solutions of the following partial differential equations: (a) $u_t = x^2 u_x$, (b) $u_t + u\, u_x = 0$, (c) $u_{tt} = u_{xx}$.


8.2.12. Show that the wave equation $u_{tt} = c^2 u_{xx}$ has the following invariance properties: if $u(t,x)$ is a solution, so is (a) any time translate: $u(t - a, x)$, where $a$ is fixed; (b) any space translate: $u(t, x - b)$, where $b$ is fixed; (c) the dilated function $u(\beta t, \beta x)$ for $\beta \neq 0$; (d) any derivative: say $\partial u/\partial x$ or $\partial^2 u/\partial t^2$, provided $u$ is sufficiently smooth.

♦ 8.2.13. Suppose $a = 0$, $b \neq 0$ in the scaling transformation (8.57).

(a) Discuss how to reduce the partial differential equation to an ordinary differential equation for the corresponding similarity solutions.

(b) Illustrate your method with the partial differential equation $t\, u_t = u\, u_{xx}$.

8.2.14. True or false: (a) A homogeneous polynomial solution to a partial differential equation is always a similarity solution. (b) An inhomogeneous polynomial solution to a partial differential equation can never be a similarity solution.

8.2.15. (a) Find all scaling symmetries of the two-dimensional Laplace equation $u_{xx} + u_{yy} = 0$. (b) Write down the ordinary differential equation for the similarity solutions. (c) Can you find an explicit formula for the similarity solutions? Hint: Look at Exercise 8.2.14(a).

♥ 8.2.16. Besides the translations and scalings, Lie symmetry methods, [87], produce two other classes of symmetry transformations for the heat equation $u_t = u_{xx}$. Given that $u(t,x)$ is a solution to the heat equation:

(a) Prove that $U(t,x) = e^{c^2 t - c x}\, u(t, x - 2ct)$ is also a solution to the heat equation for any $c \in \mathbb{R}$. What solution do you obtain if $u(t,x) = a$ is a constant solution? Remark: This transformation can be interpreted as the effect of a Galilean boost to a coordinate frame that is moving with speed $c$.

(b) Prove that

$$U(t,x) = \frac{e^{-c x^2/(4(1+ct))}}{\sqrt{1+ct}}\; u\Bigl( \frac{t}{1+ct},\, \frac{x}{1+ct} \Bigr)$$

is a solution to the heat equation for any $c \in \mathbb{R}$. What solution do you obtain if $u(t,x) = a$ is a constant?


8.3  The  Maximum  Principle 

We have already noted the temporal decay of temperature, as governed by the heat equation, to thermal equilibrium. While the temperature at any individual point in a physical medium can fluctuate, depending on what is happening elsewhere, thermodynamics tells us that the overall heat content of an isolated body must continually decrease. The Maximum Principle is the mathematical formulation of this physical law, and states that the temperature of a body cannot, in the absence of external heat sources, ever become larger than its initial or boundary values. This can be viewed as a dynamical counterpart to the Maximum Principle for the Laplace equation, as formulated in Theorem 4.9, stating that the maximum temperature of a body in equilibrium is achieved only on its boundary.

The proof of the Maximum Principle will be facilitated if we analyze the more general situation in which heat energy is being continually extracted throughout the body.

Theorem 8.6. Let $\gamma > 0$. Suppose $u(t,x)$ is a solution to the forced heat equation

$$\frac{\partial u}{\partial t} = \gamma\, \frac{\partial^2 u}{\partial x^2} + F(t,x) \tag{8.65}$$

on the rectangular domain

$$R = \{\, a < x < b,\ 0 < t < c \,\}.$$

Assume that the forcing term is nowhere positive: $F(t,x) \le 0$ for all $(t,x) \in R$. Then the maximum of $u(t,x)$ on the closed rectangle $\overline{R}$ is attained at $t = 0$ or $x = a$ or $x = b$.

In other words, if no new heat is being introduced, the maximum overall temperature occurs either at the initial time or on the body's boundary. In particular, in the fully insulated case $F(t,x) \equiv 0$, (8.65) reduces to the heat equation, and Theorem 8.6 applies as stated.


Proof: First, let us prove the result under the stronger assumption $F(t,x) < 0$, which implies that

$$\frac{\partial u}{\partial t} < \gamma\, \frac{\partial^2 u}{\partial x^2} \tag{8.66}$$

everywhere in the rectangle $R$. Suppose first that $u(t,x)$ has a (local) maximum at a point $(t^\star, x^\star)$ in the interior of $R$. Then, by multivariable calculus, [8, 108], its gradient must vanish there, $\nabla u(t^\star, x^\star) = 0$, and hence

$$u_t(t^\star, x^\star) = u_x(t^\star, x^\star) = 0. \tag{8.67}$$

Our assumption implies that the scalar function $h(x) = u(t^\star, x)$ has a maximum at $x = x^\star$. Thus, by the second derivative test for functions of a single variable,

$$h''(x^\star) = u_{xx}(t^\star, x^\star) \le 0. \tag{8.68}$$

But the requirements (8.67-68) are clearly incompatible with the initial inequality (8.66). We conclude that the solution $u(t,x)$ cannot have a local maximum at any point in the interior of $R$.

We still need to exclude the possibility of a maximum occurring at a non-corner point $(t^\star, x^\star) = (c, x^\star)$, $a < x^\star < b$, on the right-hand edge of the rectangle. If such were to occur, then the function $g(t) = u(t, x^\star)$ would be nondecreasing at $t = c$, and hence $g'(c) = u_t(c, x^\star) \ge 0$ there. The preceding argument also implies that $u_{xx}(c, x^\star) \le 0$, and again these two requirements are incompatible with (8.66). We conclude that any (local) maximum must occur on one of the other three sides of the rectangle, in accordance with the statement of the theorem.

To generalize the argument to the case $F(t,x) \le 0$ (which includes the heat equation) requires a little trick. Starting with the solution $u(t,x)$ to (8.65), we set

$$v(t,x) = u(t,x) + \varepsilon\, x^2, \qquad \text{where} \qquad \varepsilon > 0.$$

Then,

$$\frac{\partial v}{\partial t} = \frac{\partial u}{\partial t} = \gamma\, \frac{\partial^2 u}{\partial x^2} + F(t,x) = \gamma\, \frac{\partial^2 v}{\partial x^2} - 2\gamma\varepsilon + F(t,x) = \gamma\, \frac{\partial^2 v}{\partial x^2} + \widetilde{F}(t,x),$$

where, by our original assumption on $F(t,x)$,

$$\widetilde{F}(t,x) = F(t,x) - 2\gamma\varepsilon < 0$$

everywhere in $R$. Thus, by the previous argument, a local maximum of $v(t,x)$ can occur only when $t = 0$ or $x = a$ or $x = b$. Now we let $\varepsilon \to 0$ and conclude the same for $u$. More rigorously, let $M$ denote the maximum value of $u(t,x)$ on the indicated three sides of the rectangle. Then

$$v(t,x) \le M + \varepsilon \max\{\, a^2, b^2 \,\}$$

there, and hence, by the preceding argument,

$$u(t,x) \le v(t,x) \le M + \varepsilon \max\{\, a^2, b^2 \,\} \qquad \text{for all} \quad (t,x) \in R.$$

Now, letting $\varepsilon \to 0^+$ proves that $u(t,x) \le M$ everywhere in $R$. Q.E.D.

For the unforced heat equation, we can bound the solution from both above and below by its boundary and initial temperatures:

Corollary 8.7. Suppose $u(t,x)$ solves the heat equation $u_t = \gamma\, u_{xx}$, with $\gamma > 0$, for $a < x < b$, $0 < t < c$. Set

$$B = \{\, (0,x) \mid a \le x \le b \,\} \cup \{\, (t,a) \mid 0 \le t \le c \,\} \cup \{\, (t,b) \mid 0 \le t \le c \,\},$$

and let

$$M = \max \{\, u(t,x) \mid (t,x) \in B \,\}, \qquad m = \min \{\, u(t,x) \mid (t,x) \in B \,\}, \tag{8.69}$$

be, respectively, the maximum and minimum values for the initial and boundary temperatures. Then $m \le u(t,x) \le M$ for all $a \le x \le b$, $0 \le t \le c$.

Proof: The upper bound $u(t,x) \le M$ follows from the Maximum Principle of Theorem 8.6. To establish the lower bound, we note that $\widetilde{u}(t,x) = -u(t,x)$ also solves the heat equation, satisfying $\widetilde{u}(t,x) \le -m$ on $B$, and hence, by the Maximum Principle, everywhere in the rectangle. But this implies $u(t,x) = -\widetilde{u}(t,x) \ge m$. Q.E.D.

Remark: Theorem 8.6 is sometimes referred to as the Weak Maximum Principle for the heat equation. The Strong Maximum Principle states that, provided the solution $u(t,x)$ is not constant, its value at any non-initial, non-boundary point $(t,x) \in R = \{\, a < x < b,\ 0 < t < c \,\}$ is strictly less than its maximum initial and boundary values; in other words, $u(t,x) < M$ for $(t,x) \in R$, where $M$ is given in (8.69). Similarly, the Strong Maximum Principle implies that, for nonconstant solutions to the heat equation, the inequalities in Corollary 8.7 are strict: $m < u(t,x) < M$ for all $(t,x) \in R$. Proofs of the Strong Maximum Principle are more delicate, and can be found in [38, 61].
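The bounds of Corollary 8.7 have a simple discrete counterpart that can be observed numerically. The sketch below is an illustration only, not part of the proof: it uses the standard explicit (forward-time, centered-space) finite-difference scheme, for which each new value is a weighted average of old values whenever the ratio $r = \gamma\, \Delta t / \Delta x^2$ is at most $\tfrac12$. The grid size, the value $r = 0.4$, and the initial data are hypothetical choices.

```python
def explicit_heat_step(u, r):
    """One forward-Euler step; r = gamma*dt/dx**2, boundary values held fixed."""
    new = u[:]
    for j in range(1, len(u) - 1):
        new[j] = r * u[j - 1] + (1.0 - 2.0 * r) * u[j] + r * u[j + 1]
    return new


def run(initial, r, steps):
    """Return the list of all time levels, starting from `initial`."""
    levels = [initial]
    for _ in range(steps):
        levels.append(explicit_heat_step(levels[-1], r))
    return levels


n = 40
xs = [j / n for j in range(n + 1)]
# Hypothetical initial temperature on 0 <= x <= 1, vanishing at both ends.
initial = [x * (1.0 - x) * (1.0 + 4.0 * x) for x in xs]
levels = run(initial, r=0.4, steps=500)

# Values on the set B: the initial line together with the two lateral sides.
boundary_values = initial + [lvl[0] for lvl in levels] + [lvl[-1] for lvl in levels]
M, m = max(boundary_values), min(boundary_values)
all_values = [value for lvl in levels for value in lvl]
print(m <= min(all_values), max(all_values) <= M)
```

Choosing $r > \tfrac12$ makes the middle weight $1 - 2r$ negative, and the averaging argument (and with it the discrete bound) is no longer guaranteed.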

One immediate application of the Maximum Principle is to prove uniqueness of solutions to the heat equation.

Theorem 8.8. There is at most one solution to the Dirichlet initial-boundary value problem for the forced heat equation.

Proof: Suppose $u$ and $\widetilde{u}$ are any two solutions with the same initial and boundary values. Then their difference $v = u - \widetilde{u}$ solves the homogeneous initial-boundary value problem for the unforced heat equation, with minimum and maximum boundary values $m = 0 \le v(t,x) \le 0 = M$ for $t = 0$, $a \le x \le b$, and also $x = a$ or $b$, $0 \le t \le c$. But then Corollary 8.7 implies that $0 \le v(t,x) \le 0$ everywhere, which implies that $u = \widetilde{u}$, thereby establishing uniqueness. Q.E.D.

Remark: Existence of the solution follows from the convergence of our Fourier series, assuming that the initial and boundary data and the forcing function are sufficiently nice.


Exercises 


8.3.1. True or false: Assuming no external heat source, if the initial and boundary temperatures of a one-dimensional body are always positive, the temperature within the body is necessarily positive.


8.3.2. Suppose $u(t,x)$ and $v(t,x)$ are two solutions to the heat equation such that $u \le v$ when $t = 0$ and when $x = a$ or $x = b$. Prove that $u(t,x) \le v(t,x)$ for all $a \le x \le b$ and all $t \ge 0$. Provide a physical interpretation of this result.


8.3.3. For $t > 0$, let $u(t,x)$ be a solution to the unforced heat equation on an interval $a < x < b$, subject to homogeneous Dirichlet boundary conditions. Prove that $M(t) = \max \{\, u(t,x) \mid a \le x \le b \,\}$ is a nonincreasing function of $t$.


8.3.4. (a) State and prove a Maximum Principle for the convection-diffusion equation $u_t = u_{xx} + u_x$. (b) Does the equation $u_t = u_{xx} - u_x$ also admit a Maximum Principle?


8.3.5. Consider the parabolic equation

$$\frac{\partial u}{\partial t} = x\, \frac{\partial^2 u}{\partial x^2} + \frac{\partial u}{\partial x}$$

on the interval $1 < x < 2$, with initial and boundary conditions $u(0,x) = f(x)$, $u(t,1) = \alpha(t)$, $u(t,2) = \beta(t)$.

(a) State and prove a version of the Maximum Principle for this problem.

(b) Establish uniqueness of the solution to this initial-boundary value problem.


8.3.6. (a) Show that $u(t,x) = -x^2 - 2xt$ is a solution to the diffusion equation $u_t = x\, u_{xx}$. (b) Explain why this differential equation does not admit a Maximum Principle.


8.3.7. Suppose that $u(t,x)$ is a nonconstant solution to the heat equation on the interval $0 < x < \ell$ when subject to either homogeneous (a) Dirichlet, (b) Neumann, or (c) mixed boundary conditions. Prove that the function $E(t) = \int_0^\ell u(t,x)^2\, dx$ is everywhere decreasing: $E(t_1) > E(t_2)$ whenever $t_1 < t_2$.


8.3.8. True or false: The wave equation $u_{tt} = c^2 u_{xx}$ satisfies a Maximum Principle. If true, clearly state the principle; if false, explain why not.


8.4  Nonlinear  Diffusion 


First-order partial differential equations serve to model conservative wave motion, beginning with the basic one-dimensional scalar transport equations that we studied in Chapter 2, and progressing on to higher-dimensional systems, the equations of gas dynamics, the full-blown Euler equations of fluid mechanics, and yet more complicated systems of partial differential equations modeling plasmas, magneto-hydrodynamics, etc. However, such systems fail to account for frictional and viscous effects, which are typically modeled by parabolic diffusion equations such as the heat equation and its generalizations, both linear and nonlinear. In this section, we investigate the consequences of combining nonlinear wave motion with linear diffusion by analyzing the simplest such model. As we will see, the dissipative term has the effect of smoothing out abrupt shock discontinuities, and the result is a well-determined, smooth dynamical process with classical solutions. Moreover, in the inviscid limit, the smooth solutions converge (nonuniformly) to a discontinuous shock wave, leading to the method of viscosity solutions that has been successfully employed to analyze such nonlinear dynamical processes.

Burgers' Equation


The simplest nonlinear diffusion equation is known as† Burgers' equation

$$u_t + u\, u_x = \gamma\, u_{xx}, \tag{8.70}$$

which is obtained by appending a simple linear diffusion term to the nonlinear transport equation (2.31). As with the heat equation, the diffusion coefficient $\gamma \ge 0$ must be nonnegative in order that the initial value problem be well-posed in forwards time. In fluid and gas dynamics, one interprets the right-hand side as modeling the effect of viscosity, and so Burgers' equation represents a very simplified version of the equations of viscous fluid flows, including the celebrated and widely applied Navier-Stokes equations (1.4), [122]. When the viscosity coefficient vanishes, $\gamma = 0$, Burgers' equation reduces to the nonlinear transport equation (2.31), which, as a consequence, is often referred to as the inviscid Burgers' equation.


† The equation is named after the Dutch physicist Johannes Martinus Burgers, [26], and so the apostrophe goes after the "s". Burgers' equation was apparently first studied as a physical model by the British (later American) applied mathematician Harry Bateman, [13], in the early twentieth century.

Since Burgers' equation is of first order in $t$, we expect that its solutions will be uniquely prescribed by their initial values

$$u(0,x) = f(x), \qquad -\infty < x < \infty. \tag{8.71}$$

(For simplicity, we will ignore boundary effects here.) Small, slowly varying solutions (more specifically, those for which both $|u(t,x)|$ and $|u_x(t,x)|$ are small) tend to act like solutions to the heat equation, smoothing out and decaying to 0 as time progresses. On the other hand, when the solution is large or rapidly varying, the nonlinear term tends to play the dominant role, and we might expect the solution to behave like nonlinear transport waves, perhaps steepening into some sort of shock. But, as we will learn, the smoothing effect of the diffusion term, no matter how small, ultimately prevents the appearance of a discontinuous shock wave. Indeed, it can be proved that, under rather mild assumptions on the initial data, the solution to the initial value problem (8.70-71) remains smooth and well defined for all subsequent times, [122].

The simplest explicit solutions are the traveling waves, for which

$$u(t,x) = v(\xi) = v(x - ct), \qquad \text{where} \qquad \xi = x - ct, \tag{8.72}$$

indicates a fixed profile, moving to the right with constant speed $c$. By the chain rule,

$$\frac{\partial u}{\partial t} = -c\, v'(\xi), \qquad \frac{\partial u}{\partial x} = v'(\xi), \qquad \frac{\partial^2 u}{\partial x^2} = v''(\xi).$$

Substituting these expressions into Burgers' equation (8.70), we conclude that $v(\xi)$ must satisfy the nonlinear second-order ordinary differential equation

$$-c\, v' + v\, v' = \gamma\, v''.$$

This equation can be solved by first integrating both sides with respect to $\xi$, and so

$$\gamma\, v' = k - c\, v + \tfrac12\, v^2,$$

where $k$ is a constant of integration. Following the analysis after Proposition 2.3, as $\xi \to \pm\infty$, the bounded solutions to such an autonomous first-order ordinary differential equation tend to one of the fixed points provided by the roots of the quadratic polynomial on the right-hand side. Therefore, for there to be a bounded traveling-wave solution $v(\xi)$, the quadratic polynomial must have two real roots, which requires $k < \tfrac12 c^2$. Assuming this holds, we rewrite the equation in the form


$$2\gamma\, \frac{dv}{d\xi} = (v - a)(v - b), \qquad \text{where} \qquad c = \tfrac12 (a + b), \quad k = \tfrac12\, ab. \tag{8.73}$$

To obtain bounded solutions, we must require $a < v < b$. Integrating (8.73) by the usual method, cf. (2.19), we find

$$\int \frac{2\gamma\, dv}{(v - a)(v - b)} = \frac{2\gamma}{b - a}\, \log\Bigl( \frac{b - v}{v - a} \Bigr) = \xi - \delta,$$

where $\delta$ is another constant of integration. Solving for

$$v(\xi) = \frac{a\, e^{(b-a)(\xi - \delta)/(2\gamma)} + b}{e^{(b-a)(\xi - \delta)/(2\gamma)} + 1},$$

and recalling (8.73), we conclude that the bounded traveling-wave solutions to Burgers' equation all have the explicit form

$$u(t,x) = \frac{a\, e^{(b-a)(x - ct - \delta)/(2\gamma)} + b}{e^{(b-a)(x - ct - \delta)/(2\gamma)} + 1}, \tag{8.74}$$

where $a < b$ and $\delta$ are arbitrary constants. Observe that our solution is a monotonically decreasing function of $x$, with asymptotic values

$$\lim_{x \to -\infty} u(t,x) = b, \qquad \lim_{x \to \infty} u(t,x) = a,$$

at large distances. The wave travels to the right, unchanged in form, with speed $c = \tfrac12 (a + b)$ equal to the average of its asymptotic values. In particular, if $a = -b$, the result is a stationary-wave solution. In Figure 8.5 we graph sample profiles, corresponding to $a = .1$, $b = 1$, for three different values of the diffusion coefficient. Note that the smaller $\gamma$ is, the sharper the transition layer between the two asymptotic values of the solution.

Figure 8.5. Traveling-wave solutions to Burgers' equation.
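The traveling-wave formula (8.74) lends itself to a quick numerical check. The following sketch assumes the parameter values $a = 0.1$, $b = 1$ mentioned for Figure 8.5, together with $\gamma = 0.1$ and $\delta = 0$, which are arbitrary choices here; the function names are illustrative, and the residual of Burgers' equation is estimated by finite differences.

```python
import math


def traveling_wave(t, x, a=0.1, b=1.0, gamma=0.1, delta=0.0):
    """Formula (8.74), arranged so that the exponential never overflows."""
    c = 0.5 * (a + b)  # wave speed, from (8.73)
    s = (b - a) * (x - c * t - delta) / (2.0 * gamma)
    if s > 0.0:
        e = math.exp(-s)  # numerator and denominator divided by exp(s)
        return (a + b * e) / (1.0 + e)
    e = math.exp(s)
    return (a * e + b) / (e + 1.0)


def burgers_residual(u, t, x, gamma, h=1e-4):
    """Finite-difference estimate of u_t + u*u_x - gamma*u_xx at (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2.0 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2.0 * h)
    u_xx = (u(t, x + h) - 2.0 * u(t, x) + u(t, x - h)) / (h * h)
    return u_t + u(t, x) * u_x - gamma * u_xx


gamma = 0.1


def profile(t, x):
    return traveling_wave(t, x, gamma=gamma)


max_residual = max(
    abs(burgers_residual(profile, t, x, gamma))
    for t in (0.0, 1.0, 2.0)
    for x in (-1.0, 0.0, 0.5, 1.5)
)
left_limit, right_limit = profile(0.0, -50.0), profile(0.0, 50.0)
grid = [profile(0.0, -3.0 + 0.1 * j) for j in range(61)]
is_decreasing = all(p >= q for p, q in zip(grid, grid[1:]))
print(max_residual, left_limit, right_limit, is_decreasing)
```

Rerunning with a smaller `gamma` shows the transition layer between the values $b$ and $a$ becoming steeper, although the step size `h` then needs to shrink accordingly for the residual estimate to stay meaningful.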

In the inviscid limit as the diffusion becomes vanishingly small, $\gamma \to 0$, the traveling-wave solutions (8.74) converge to the step shock-wave solutions (2.51) of the nonlinear transport equation. Indeed, this can be proved to hold in general: as $\gamma \to 0$, solutions to Burgers' equation (8.70) converge to the corresponding solutions to the nonlinear transport equation (2.31) that are subject to the Rankine-Hugoniot and entropy conditions (2.53, 55). Thus, the method of vanishing viscosity allows one to monitor solutions to the nonlinear transport equation as they evolve into regimes where multiple shocks interact and merge. This approach also reconfirms our physical intuition, in that most physical systems retain a very small dissipative component that serves to mollify abrupt discontinuities that might appear in a theoretical model that fails to take friction or viscous effects into account. In the modern theory of partial differential equations, the resulting viscosity solution method has been successfully used to characterize the discontinuous solutions to a broad range of inviscid nonlinear wave equations as limits of classical solutions to a viscously regularized system. We refer the interested reader to [64, 107, 122] for further details.


The Hopf-Cole Transformation

By a remarkable stroke of good fortune, the nonlinear Burgers' equation can be converted into the linear heat equation and thereby explicitly solved. The transformation that linearizes the nonlinear Burgers' equation first appeared in an obscure exercise in a nineteenth-century differential equations textbook, [41; vol. 6, p. 102]. Its rediscovery by the applied mathematicians Eberhard Hopf, [56], and Julian Cole, [32], was a milestone in the modern era of nonlinear partial differential equations, and it is now named the Hopf-Cole transformation in their honor.


In  general,  linearization  —  that  is,  converting  a  given  nonlinear  differential  equation 
into  a  linear  equation  —  is  extremely  challenging,  and,  in  most  instances,  impossible.  On 
the  other  hand,  the  reverse  process  —  “nonlinearizing”  a  linear  equation  —  is  trivial: 
any  nonlinear  change  of  dependent  variables  will  do  the  trick!  However,  the  resulting 
nonlinear  equation,  while  evidently  linearizable  by  inverting  the  change  of  variables,  is 
rarely  of  independent  interest.  But  sometimes  there  is  a  lucky  accident,  and  the  resulting 
linearization  of  a  physically  relevant  nonlinear  differential  equation  can  have  a  profound 
impact  on  our  understanding  of  more  complicated  nonlinear  systems. 


In the present context, our starting point is the linear heat equation

$$v_t = \gamma\, v_{xx}. \tag{8.75}$$

Among all possible nonlinear changes of dependent variable, one of the simplest that might spring to mind is an exponential function. Let us, therefore, investigate the effect of an exponential change of variables

$$v(t,x) = e^{\alpha\, \varphi(t,x)}, \qquad \text{so} \qquad \varphi(t,x) = \frac{1}{\alpha}\, \log v(t,x), \tag{8.76}$$

where $\alpha$ is a nonzero constant. The function $\varphi(t,x)$ is real, provided $v(t,x)$ is a positive solution to the heat equation. Fortunately, this is not hard to arrange: if the initial data $v(0,x) > 0$ is strictly positive, then, as a consequence of the Maximum Principle in Corollary 8.7, the resulting solution $v(t,x) > 0$ is positive for all $t > 0$.

To determine the differential equation satisfied by the function $\varphi$, we invoke the chain and product rules to differentiate (8.76):

$$v_t = \alpha\, \varphi_t\, e^{\alpha\varphi}, \qquad v_x = \alpha\, \varphi_x\, e^{\alpha\varphi}, \qquad v_{xx} = \bigl( \alpha\, \varphi_{xx} + \alpha^2 \varphi_x^2 \bigr)\, e^{\alpha\varphi}.$$

Substituting the first and last formulas into the heat equation (8.75) and canceling a common exponential factor, we conclude that $\varphi(t,x)$ satisfies the nonlinear partial differential equation

$$\varphi_t = \gamma\, \varphi_{xx} + \gamma\alpha\, \varphi_x^2, \tag{8.77}$$

known as the potential Burgers' equation, for reasons that will soon become apparent.

The second step in the process is to differentiate the potential Burgers' equation with respect to $x$; the result is

$$\varphi_{tx} = \gamma\, \varphi_{xxx} + 2\gamma\alpha\, \varphi_x\, \varphi_{xx}. \tag{8.78}$$

If we now set

$$\frac{\partial \varphi}{\partial x} = u, \tag{8.79}$$

so that $\varphi$ acquires the status of a potential function, then the resulting partial differential equation

$$u_t = \gamma\, u_{xx} + 2\gamma\alpha\, u\, u_x$$

coincides with Burgers' equation (8.70) when $\alpha = -1/(2\gamma)$. In this manner, we have arrived at the famous Hopf-Cole transformation.

Figure 8.6. Trigonometric solution to Burgers' equation.

Theorem 8.9. If $v(t,x) > 0$ is any positive solution to the linear heat equation $v_t = \gamma\, v_{xx}$, then

$$u(t,x) = \frac{\partial}{\partial x} \bigl( -2\gamma \log v(t,x) \bigr) = -2\gamma\, \frac{v_x}{v} \tag{8.80}$$

solves Burgers' equation $u_t + u\, u_x = \gamma\, u_{xx}$.


Do all solutions to Burgers' equation arise in this way? In order to answer this question,
we run the argument in reverse. First, choose a potential function $\tilde\varphi(t,x)$ that satisfies
(8.79); for example,

$$\tilde\varphi(t,x) = \int_0^x u(t,y)\, dy.$$

If $u(t,x)$ is any solution to Burgers' equation, then $\tilde\varphi(t,x)$ satisfies (8.78). Integrating both
sides of the latter equation with respect to $x$, we conclude that

$$\tilde\varphi_t = \gamma\, \tilde\varphi_{xx} + \gamma\alpha\, \tilde\varphi_x^2 + g(t),$$

for some integration "constant" $g(t)$. Thus, unless $g(t) \equiv 0$, our potential function $\tilde\varphi$
doesn't satisfy the potential Burgers' equation (8.77), but that is because we chose the
"wrong" potential. Indeed, if we define

$$\varphi(t,x) = \tilde\varphi(t,x) - G(t), \qquad \text{where} \qquad G'(t) = g(t),$$

then

$$\varphi_t = \tilde\varphi_t - g(t) = \gamma\, \tilde\varphi_{xx} + \gamma\alpha\, \tilde\varphi_x^2 = \gamma\, \varphi_{xx} + \gamma\alpha\, \varphi_x^2,$$

and hence the modified potential $\varphi(t,x)$ is a solution to the potential Burgers' equation
(8.77). From this it easily follows that

$$v(t,x) = e^{-\varphi(t,x)/(2\gamma)} \tag{8.81}$$

is a positive solution to the heat equation, from which the Burgers' solution $u(t,x)$ can
be recovered through (8.80). We conclude that every solution to Burgers' equation comes
from a positive solution to the heat equation via the Hopf-Cole transformation.

Example 8.10. As a simple example, the separable solution

$$v(t,x) = a + b\, e^{-\gamma\omega^2 t} \cos \omega x$$

to the heat equation leads to the following solution to Burgers' equation:

$$u(t,x) = \frac{2\gamma\, b\, \omega \sin \omega x}{a\, e^{\gamma\omega^2 t} + b \cos \omega x}. \tag{8.82}$$


A representative example is plotted in Figure 8.6. We should require that $a > |b|$ in
order that $v(t,x) > 0$ be a positive solution to the heat equation for $t \geq 0$; otherwise the
resulting solution to Burgers' equation will have singularities at the roots of $v$, as in
the first graph in Figure 8.6. This family of solutions is primarily affected by the viscosity
term, and rapidly decays to zero.
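The claim in Example 8.10 can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the parameter values `GAMMA`, `A`, `B`, `OMEGA` and the function names are illustrative assumptions, and derivatives are approximated by centered differences. It checks that (8.82) agrees with the Hopf-Cole formula (8.80) applied to the separable heat solution, and that the Burgers residual $u_t + u u_x - \gamma u_{xx}$ is small.

```python
import math

# Illustrative parameter values (assumptions, not from the text); A > |B| keeps v positive.
GAMMA = 0.5
A, B, OMEGA = 2.0, 1.0, 1.5

def v(t, x):
    """Separable heat-equation solution a + b e^{-gamma omega^2 t} cos(omega x)."""
    return A + B * math.exp(-GAMMA * OMEGA**2 * t) * math.cos(OMEGA * x)

def u(t, x):
    """Closed-form Burgers solution (8.82)."""
    return (2 * GAMMA * B * OMEGA * math.sin(OMEGA * x)
            / (A * math.exp(GAMMA * OMEGA**2 * t) + B * math.cos(OMEGA * x)))

def hopf_cole(t, x, h=1e-5):
    """Formula (8.80): -2 gamma v_x / v, with v_x from a centered difference."""
    v_x = (v(t, x + h) - v(t, x - h)) / (2 * h)
    return -2 * GAMMA * v_x / v(t, x)

def burgers_residual(t, x, h=1e-3):
    """Centered-difference estimate of u_t + u u_x - gamma u_xx."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    u_xx = (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / h**2
    return u_t + u(t, x) * u_x - GAMMA * u_xx

if __name__ == "__main__":
    print(hopf_cole(0.3, 0.7), u(0.3, 0.7))   # the two values should agree closely
    print(burgers_residual(0.3, 0.7))         # should be small (discretization error only)
```

The residual is limited by the finite-difference step `h`, so it is small rather than exactly zero.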

To solve the initial value problem (8.70-71) for Burgers' equation, we note that, under
the Hopf-Cole transformation (8.80),

$$v(0,x) = \exp\left( -\frac{\varphi(0,x)}{2\gamma} \right) = \exp\left( -\frac{1}{2\gamma} \int_0^x f(y)\, dy \right) = h(x). \tag{8.83}$$

Remark: The lower limit of the integral can be changed from 0 to any other convenient
value. The only effect is to multiply $v(t,x)$ by an overall constant, which does not change
the final form of $u(t,x)$ in (8.80).

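According to formula (8.16) (adapted to general diffusivity, as in Exercise 8.2.3), the
solution to the initial value problem (8.75, 83) for the heat equation can be expressed as a
convolution integral with the fundamental solution

$$v(t,x) = \frac{1}{2\sqrt{\pi\gamma t}} \int_{-\infty}^{\infty} e^{-(x-\xi)^2/(4\gamma t)}\, h(\xi)\, d\xi.$$

Therefore, setting $\hat v(t,x) = 2\sqrt{\pi\gamma t}\; v(t,x)$, the solution to the Burgers' initial value problem
(8.70-71), valid for $t > 0$, is given by

$$u(t,x) = -\frac{2\gamma}{\hat v(t,x)}\, \frac{\partial \hat v}{\partial x}, \qquad \text{where} \qquad \hat v(t,x) = \int_{-\infty}^{\infty} e^{-H(t,x;\xi)}\, d\xi, \qquad H(t,x;\xi) = \frac{(x-\xi)^2}{4\gamma t} + \frac{1}{2\gamma} \int_0^{\xi} f(\eta)\, d\eta. \tag{8.84}$$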
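For initial data without a closed-form solution, formula (8.84) can be evaluated by direct quadrature. The sketch below is one possible way to do this and is not taken from the text: the function names, the truncation interval for $\xi$, the trapezoid rule, and the centered difference in $x$ are all assumptions made for illustration. The caller supplies `F`, the antiderivative $\int_0^\xi f(\eta)\,d\eta$ of the initial data. The smallest value of $H$ is factored out before exponentiating so that the integrand does not underflow to zero everywhere.

```python
import math

def log_v_hat(F, gamma, t, x, xi_min=-20.0, xi_max=20.0, n=4000):
    """Log of the integral of exp(-H(t, x; xi)) over [xi_min, xi_max] (trapezoid rule)."""
    step = (xi_max - xi_min) / n
    H = []
    for j in range(n + 1):
        xi = xi_min + j * step
        H.append((x - xi) ** 2 / (4.0 * gamma * t) + F(xi) / (2.0 * gamma))
    H_min = min(H)                                  # dominant term, factored out for stability
    weights = [math.exp(H_min - h) for h in H]      # all exponents are <= 0
    total = sum(weights) - 0.5 * (weights[0] + weights[-1])
    return math.log(total * step) - H_min

def burgers_solution(F, gamma, t, x, dx=1e-3):
    """u = -2 gamma d/dx log(v_hat), with the x-derivative taken by a centered difference."""
    plus = log_v_hat(F, gamma, t, x + dx)
    minus = log_v_hat(F, gamma, t, x - dx)
    return -2.0 * gamma * (plus - minus) / (2.0 * dx)

def F_step(xi, a=1.0, b=0.1):
    """Antiderivative of step data equal to a for xi < 0 and b for xi > 0 (illustrative values)."""
    return a * xi if xi < 0 else b * xi

if __name__ == "__main__":
    for x in (-3.0, 0.0, 0.275, 1.0, 3.0):
        print(x, burgers_solution(F_step, gamma=0.1, t=0.5, x=x))
```

A simple sanity check is constant initial data $f \equiv c_0$, for which the Burgers solution stays equal to $c_0$. The truncation interval must be wide enough to contain the region where $H$ is near its minimum.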


Example 8.11. To demonstrate the smoothing effect of the diffusion terms, let us
see what happens to the initial data

$$u(0,x) = \begin{cases} a, & x < 0, \\ b, & x > 0, \end{cases} \tag{8.85}$$

in the form of a step function. We assume that $a > b$, which corresponds to a shock wave
in the inviscid limit $\gamma = 0$. (In Exercise 8.4.4, the reader is asked to analyze the case $a < b$,
which corresponds to a rarefaction wave.) In this case,

$$H(t,x;\xi) = \frac{(x-\xi)^2}{4\gamma t} + \begin{cases} \dfrac{a\,\xi}{2\gamma}, & \xi < 0, \\[1ex] \dfrac{b\,\xi}{2\gamma}, & \xi > 0. \end{cases} \tag{8.86}$$

Figure 8.7. Shock-wave solution to Burgers' equation. (Panels include t = .01 and t = .5.)


After some algebraic manipulations, the solution (8.84) is found to have the explicit form

$$u(t,x) = a + \frac{b-a}{1 + \exp\left( \dfrac{b-a}{2\gamma}\,(x - ct) \right) \dfrac{\operatorname{erfc}\left( \dfrac{x - at}{2\sqrt{\gamma t}} \right)}{\operatorname{erfc}\left( \dfrac{bt - x}{2\sqrt{\gamma t}} \right)}}, \tag{8.87}$$

with $c = \frac12 (a + b)$, where $\operatorname{erfc} z = 1 - \operatorname{erf} z$ denotes the complementary error function
(8.43). The solution, for $a = 1$, $b = .1$, and $\gamma = .03$, is plotted at various times in
Figure 8.7. Observe that, as with the heat equation, the jump discontinuity is immediately
smoothed out, and the solution soon assumes the form of a smoothly varying transition
between its two original heights. The larger the diffusion coefficient in relation to the
jump magnitude, the more pronounced the smoothing effect. Moreover, as $\gamma \to 0$, the
solution $u(t,x)$ converges to the shock-wave solution (2.51) to the transport equation, in
which the speed of the shock is $c$, the average of the step heights, in accordance with
the Rankine-Hugoniot shock rule. Indeed, in view of (2.88),

$$\lim_{z \to \infty} \operatorname{erfc} z = 0, \qquad \lim_{z \to -\infty} \operatorname{erfc} z = 2. \tag{8.88}$$

Thus, for $t > 0$, as $\gamma \to 0$, the ratio of the two complementary error functions in (8.87)
tends to $\infty$ when $x < bt$, to $1$ when $bt < x < at$, and to $0$ when $x > at$. On the other
hand, since $a > b$, the exponential term tends to $\infty$ when $x < ct$, and to $0$ when $x > ct$.
Put together, these imply that the solution $u(t,x) \to a$ when $x < ct$, while $u(t,x) \to b$
when $x > ct$, thus proving convergence to the shock-wave solution.
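The inviscid limit described above can be observed directly by evaluating (8.87) for decreasing $\gamma$. The following sketch assumes $a > b$; the function name `step_solution` and the overflow handling are illustrative choices, not part of the text. When the exponential overflows, or the denominator `erfc` underflows to zero, the weight in the denominator of (8.87) is effectively infinite and the value returned is $a$.

```python
import math

def step_solution(t, x, a, b, gamma):
    """Evaluate (8.87) for step data with a > b and t > 0."""
    c = 0.5 * (a + b)                      # shock speed: average of the step heights
    s = 2.0 * math.sqrt(gamma * t)
    denom = math.erfc((b * t - x) / s)
    if denom == 0.0:                       # erfc ratio is effectively infinite: u -> a
        return a
    try:
        weight = math.exp((b - a) * (x - c * t) / (2.0 * gamma))
    except OverflowError:                  # exponential term is effectively infinite: u -> a
        return a
    weight *= math.erfc((x - a * t) / s) / denom
    return a + (b - a) / (1.0 + weight)

if __name__ == "__main__":
    a, b, t = 1.0, 0.1, 1.0
    for gamma in (0.03, 0.003, 0.0003):
        # Sample just behind and just ahead of the shock position x = c t = 0.55.
        print(gamma, step_solution(t, 0.45, a, b, gamma), step_solution(t, 0.65, a, b, gamma))
```

At the point $x = ct$ the two `erfc` arguments coincide and the exponent vanishes, so the formula returns exactly the average $\frac12(a+b)$ for every $\gamma$.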


Example 8.12. Consider the case in which the initial data $u(0,x) = \delta(x)$ is a
concentrated delta function impulse at the origin. In the solution formula (8.84), starting
the integral for $H(t,x;\xi)$ at $0$ is problematic, but as noted earlier, we are free to select any
other starting point, e.g., $-\infty$. Thus, we take

$$H(t,x;\xi) = \frac{(x-\xi)^2}{4\gamma t} + \frac{1}{2\gamma} \int_{-\infty}^{\xi} \delta(\eta)\, d\eta = \begin{cases} \dfrac{(x-\xi)^2}{4\gamma t}, & \xi < 0, \\[1ex] \dfrac{1}{2\gamma} + \dfrac{(x-\xi)^2}{4\gamma t}, & \xi > 0. \end{cases}$$

Figure 8.8. Triangular-wave solution to Burgers' equation.

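We then evaluate

$$\hat v(t,x) = \int_{-\infty}^{\infty} e^{-H(t,x;\xi)}\, d\xi = \sqrt{\pi\gamma t}\, \left[\, 1 - \operatorname{erf}\left( \frac{x}{2\sqrt{\gamma t}} \right) + e^{-1/(2\gamma)} \left( 1 + \operatorname{erf}\left( \frac{x}{2\sqrt{\gamma t}} \right) \right) \right].$$

Therefore, the solution to the initial value problem is

$$u(t,x) = -\frac{2\gamma}{\hat v(t,x)}\, \frac{\partial \hat v}{\partial x} = 2 \sqrt{\frac{\gamma}{\pi t}}\; \frac{e^{-x^2/(4\gamma t)}}{\coth\left( \dfrac{1}{4\gamma} \right) - \operatorname{erf}\left( \dfrac{x}{2\sqrt{\gamma t}} \right)}, \tag{8.89}$$

where

$$\coth z = \frac{\cosh z}{\sinh z} = \frac{e^{z} + e^{-z}}{e^{z} - e^{-z}} = \frac{e^{2z} + 1}{e^{2z} - 1}$$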


is the hyperbolic cotangent function. A graph of this solution when $\gamma = .02$ and $a = 1$
appears in Figure 8.8. As you can see, the initial concentration diffuses out, but, in contrast
to the heat equation, does not remain symmetric, since the nonlinear advection term causes
the wave to steepen in front. Eventually, as the effect of the diffusion accumulates, the
propagating triangular wave becomes vanishingly small.
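Formula (8.89) is straightforward to tabulate for plotting. The sketch below is an illustration under stated assumptions rather than the author's code: the function names and the trapezoid-rule mass check are hypothetical. To limit cancellation when $\gamma$ is small, the denominator $\coth(1/(4\gamma)) - \operatorname{erf} z$ is rewritten as $[\coth(1/(4\gamma)) - 1] + \operatorname{erfc} z$, using the identity $\coth z - 1 = 2/(e^{2z} - 1)$ that follows from the definition of $\coth$ above. Because Burgers' equation can be written as $u_t + (\tfrac12 u^2)_x = \gamma u_{xx}$, the total mass $\int u\,dx$ should remain equal to the unit mass of the initial delta function.

```python
import math

def delta_solution(t, x, gamma):
    """Evaluate (8.89), the Burgers solution with delta-function initial data, for t > 0."""
    z = x / (2.0 * math.sqrt(gamma * t))
    gauss = math.exp(-x * x / (4.0 * gamma * t))
    # coth(1/(4 gamma)) - 1 = 2 / (exp(1/(2 gamma)) - 1); expm1 overflows for very small gamma.
    coth_minus_one = 2.0 / math.expm1(1.0 / (2.0 * gamma))
    return 2.0 * math.sqrt(gamma / (math.pi * t)) * gauss / (coth_minus_one + math.erfc(z))

def total_mass(t, gamma, x_min=-6.0, x_max=12.0, n=9000):
    """Trapezoid-rule approximation of the integral of u(t, x) over [x_min, x_max]."""
    step = (x_max - x_min) / n
    total = 0.5 * (delta_solution(t, x_min, gamma) + delta_solution(t, x_max, gamma))
    for j in range(1, n):
        total += delta_solution(t, x_min + j * step, gamma)
    return total * step

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, total_mass(t, gamma=0.05))   # each value should be close to 1
```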
Exercises


8.4.1. Find the solution to Burgers' equation that has the following initial data:

$u(0,x) =$ (a) $\sigma(x)$, (b) $\sigma(-x)$, (c) {  J’

8.4.2.  Starting  with  the  heat  equation  solution  v(t,x)  =  1  +  e~x  find  the  corre¬ 

sponding  solution  to  Burgers’  equation  and  discuss  its  behavior. 

8.4.3.  Justify  the  solution  formula  (8.87). 

8.4.4. (a) Prove that $\lim_{z \to \infty} z\, e^{z^2} \operatorname{erfc} z = 1/\sqrt{\pi}$. (b) Show that when $a < b$, the Burgers'
solution (8.87) converges to the rarefaction wave (2.54) in the inviscid limit $\gamma \to 0^+$.

8.4.5. True or false: If $u(t,x)$ solves Burgers' equation for the step function initial condition
$u(0,x) = \sigma(x)$, then $v(t,x) = u_x(t,x)$ solves the initial value problem with $v(0,x) = \delta(x)$.

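8.4.6. True or false: If $\hat v(t,x)$ is as given in (8.84), then

$$\frac{\partial \hat v}{\partial x} = -\int_{-\infty}^{\infty} \frac{x-\xi}{2\gamma t}\, e^{-H(t,x;\xi)}\, d\xi,$$

and hence the solution to the Burgers' initial value problem (8.70-71) can be written as

$$u(t,x) = \frac{\displaystyle \int_{-\infty}^{\infty} \frac{x-\xi}{t}\, e^{-H(t,x;\xi)}\, d\xi}{\displaystyle \int_{-\infty}^{\infty} e^{-H(t,x;\xi)}\, d\xi}, \qquad \text{where} \qquad H(t,x;\xi) = \frac{(x-\xi)^2}{4\gamma t} + \frac{1}{2\gamma} \int_0^{\xi} f(\eta)\, d\eta.$$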


8.4.7.  Show  that  if  u(t,x)  solves  Burgers’  equation,  then  U(t,x)  =  u(t,x  —  ct )  +  c  is  also  a 
solution.  What  is  the  physical  interpretation  of  this  symmetry? 

8.4.8. (a) What is the effect of a scaling transformation $(t,x,u) \mapsto (\alpha t, \beta x, \lambda u)$ on Burgers'
equation? (b) Use your result to solve the initial value problem for the rescaled Burgers'
equation $U_t + \rho\, U U_x = \sigma\, U_{xx}$, $U(0,x) = F(x)$.

T?  8.4.9.  (a)  Find  all  scaling  symmetries  of  Burgers’  equation,  (b)  Determine  the  ordinary 

differential  equation  satisfied  by  the  similarity  solutions,  (c)  True  or  false:  The  Hopf-Cole 
transformation  maps  similarity  solutions  of  the  heat  equation  to  similarity  solutions  of 
Burgers’  equation. 

8.4.10. What happens if you nonlinearize the heat equation (8.75) using the change of
variables

(a) $v = \varphi^2$; (b) $v = \sqrt{\varphi}$; (c) $v = \log \varphi$?

8.4.11. What partial differential equation results from applying the exponential change of
variables (8.76) to:

(a) the wave equation $v_{tt} = c^2 v_{xx}$? (b) the Laplace equation $v_{xx} + v_{yy} = 0$?


8.5  Dispersion  and  Solitons 

In  this  section,  we  finally  venture  beyond  the  by  now  familiar  terrain  of  second-order 
partial  differential  equations.  While  considerably  less  common  than  those  of  first  and 
second order, higher-order equations arise in certain applications, particularly third-order
dispersive models for wave motion, [2, 122], and fourth-order systems modeling elastic
plates  and  shells,  [7].  We  will  focus  our  attention  on  two  basic  third-order  evolution 
equations.  The  first  is  a  simple  linear  equation  with  a  third  derivative  term.  It  arises  as 
a  simplified  model  for  unidirectional  wave  motion,  and  thus  has  more  in  common  with 
first-order  transport  equations  than  with  the  second-order  dissipative  heat  equation.  The 
third-order  derivative  induces  a  process  of  dispersion ,  in  which  waves  of  different  frequencies 
propagate  at  different  speeds.  Thus,  unlike  the  first-  and  second-order  wave  equations,  in 
which  waves  maintain  their  initial  profile  as  they  move,  dispersive  waves  will  spread  out  and 
decay  even  while  conserving  energy.  Waves  on  the  surface  of  a  liquid  are  familiar  examples 
of  dispersive  waves  —  an  initially  concentrated  disturbance,  caused  by,  say,  throwing  a 
rock  in  a  pond,  spreads  out  over  the  surface  as  its  different  vibrational  components  move 
off  at  different  speeds. 

Our second example is a remarkable nonlinear third-order evolution equation known
as the Korteweg-de Vries equation, which combines dispersive effects with nonlinear
transport. As with Burgers' equation (but for very different mathematical reasons), the
dispersive term thwarts the tendency for solutions to break into shock waves, and, in fact,
classical solutions exist for all time. Moreover, a general localized initial disturbance will
break up into a finite number of solitary waves; the taller the wave, the faster it moves.
Even more remarkable are the interactive properties of these solitary waves. One ordinarily
expects nonlinearity to induce very complicated and not easily predictable behavior.
However, when two solitary-wave solutions to the Korteweg-de Vries equation collide, they
eventually emerge from the interaction unchanged, save for a phase shift. This unexpected
and remarkable phenomenon was first detected through numerical simulations in the 1960s
and distinguished with the neologism soliton. It was then found that solitons appear in
a surprising number of basic nonlinear physical models. The investigation of their
mathematical properties has had deep ramifications, not just within partial differential equations
and fluid mechanics, but throughout applied mathematics and theoretical physics; it has
even contributed to the solution of long-outstanding problems in complex function theory.
Further development of the modern theory and amazing properties of integrable soliton
equations can be found in [2, 36].


Linear  Dispersion 

The simplest nontrivial third-order partial differential equation is the linear equation

$$u_t + u_{xxx} = 0, \tag{8.90}$$

which models the unidirectional† propagation of linear dispersive waves. To avoid
complications engendered by boundary conditions, we shall initially look only at solutions on the
entire line, so $-\infty < x < \infty$. Since the equation involves only a first-order time derivative,
one expects its solutions to be uniquely specified by a single initial condition

$$u(0,x) = f(x), \qquad -\infty < x < \infty. \tag{8.91}$$


† Bidirectional propagation, as we saw in the wave equation, requires a second-order time
derivative. As in the d'Alembert solution to the second-order wave equation, the reduction to a
unidirectional model is based on an (approximate) factorization of the bidirectional operator.

Figure 8.9. Gaussian solution to the dispersive wave equation.


In  wave  mechanics,  u(t,x)  represents  the  height  of  the  fluid  at  time  t  and  position  x,  and 
the  initial  condition  (8.91)  specifies  the  initial  disturbance. 

As with the heat equation (and, indeed, any linear constant-coefficient evolution
equation), the Fourier transform is an effective tool for solving the initial value problem on the
real line. Assuming that the solution $u(t,\cdot) \in L^2(\mathbb{R})$ remains square integrable at all times
$t$ (a fact that can be justified a priori; see Exercise 8.5.18(b)), let

$$\hat u(t,k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} u(t,x)\, e^{-ikx}\, dx$$

be its spatial Fourier transform. Owing to its effect on derivatives, the Fourier transform
converts the partial differential equation (8.90) into a first-order linear ordinary differential
equation:

$$\frac{\partial \hat u}{\partial t} + (ik)^3\, \hat u = \frac{\partial \hat u}{\partial t} - i k^3\, \hat u = 0, \tag{8.92}$$


in which the frequency variable $k$ appears as a parameter. The corresponding initial
conditions

$$\hat u(0,k) = \hat f(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-ikx}\, dx \tag{8.93}$$

are provided by the Fourier transform of (8.91). The solution to the initial value problem
(8.92-93) is

$$\hat u(t,k) = \hat f(k)\, e^{i k^3 t}.$$

Inverting the Fourier transform yields the explicit formula for the solution

$$u(t,x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat f(k)\, e^{i(kx + k^3 t)}\, dk \tag{8.94}$$

to the initial value problem (8.90-91) for the dispersive wave equation.

Example 8.13. Suppose that the initial profile

$$u(0,x) = f(x) = e^{-x^2}$$

is a Gaussian. According to our table of Fourier transforms (see page 272),

$$\hat f(k) = \frac{e^{-k^2/4}}{\sqrt{2}},$$

and hence the corresponding solution to the dispersive wave equation (8.90) is

$$u(t,x) = \frac{1}{2\sqrt{\pi}} \int_{-\infty}^{\infty} e^{i(kx + k^3 t) - k^2/4}\, dk = \frac{1}{2\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-k^2/4} \cos(kx + k^3 t)\, dk;$$

the imaginary part vanishes thanks to the oddness of the integrand. (Indeed, the solution
must be real, since the initial data is real.) A plot of the solution at various times appears
in Figure 8.9. Note the propagation of initially rapid oscillations to the rear (negative $x$)
of the initial disturbance. The dispersion causes the oscillations to gradually spread out
and decrease in amplitude, with the effect that $u(t,x) \to 0$ uniformly as $t \to \infty$, even
though both the mass $M = \int_{-\infty}^{\infty} u(t,x)\, dx$ and the energy $E = \int_{-\infty}^{\infty} u(t,x)^2\, dx$ of the wave
are conserved, i.e., are both constant in time.
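The integral in Example 8.13 has no elementary closed form, but the Gaussian factor makes it easy to approximate. The following sketch uses a plain trapezoid rule on a truncated interval $|k| \le k_{\max}$; the function name, the cutoff `k_max`, and the number of nodes are illustrative assumptions. At $t = 0$ the result can be compared with the exact initial profile $e^{-x^2}$.

```python
import math

def dispersive_gaussian(t, x, k_max=14.0, n=4000):
    """Trapezoid approximation of (1 / (2 sqrt(pi))) * integral of e^{-k^2/4} cos(kx + k^3 t) dk."""
    step = 2.0 * k_max / n
    total = 0.0
    for j in range(n + 1):
        k = -k_max + j * step
        weight = 0.5 if j in (0, n) else 1.0          # trapezoid end-point weights
        total += weight * math.exp(-0.25 * k * k) * math.cos(k * x + k**3 * t)
    return total * step / (2.0 * math.sqrt(math.pi))

if __name__ == "__main__":
    print(dispersive_gaussian(0.0, 0.5), math.exp(-0.25))   # should agree at t = 0
    for x in (-6.0, -4.0, -2.0, 0.0, 2.0):
        print(x, dispersive_gaussian(0.5, x))
```

For larger $t$ the phase $k^3 t$ oscillates faster, so more quadrature nodes are needed to resolve the part of the integrand where the Gaussian weight is not yet negligible.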


Example 8.14. The fundamental solution to the dispersive wave equation is generated
by a concentrated initial disturbance:

$$u(0,x) = \delta(x).$$

The Fourier transform of the delta function is just $\hat\delta(k) = 1/\sqrt{2\pi}$. Therefore, the
corresponding solution (8.94) is

$$u(t,x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i(kx + k^3 t)}\, dk = \frac{1}{\pi} \int_0^{\infty} \cos(kx + k^3 t)\, dk, \tag{8.95}$$

since the solution is real (or, equivalently, the imaginary part of the integrand is odd),
while the real part of the integrand is even.

A priori, it appears that the integral (8.95) does not converge, because the integrand
does not go to zero as $|k| \to \infty$. However, the increasingly rapid oscillations induced by
the cubic term tend to cancel each other out and allow convergence. To prove this, given
$l > 0$, we perform a (non-obvious) integration by parts:

$$\begin{aligned} \int_0^l \cos(kx + k^3 t)\, dk &= \int_0^l \frac{1}{x + 3k^2 t}\, \frac{d}{dk} \sin(kx + k^3 t)\, dk \\ &= \left. \frac{\sin(kx + k^3 t)}{x + 3k^2 t}\, \right|_{k=0}^{l} - \int_0^l \frac{d}{dk} \left( \frac{1}{x + 3k^2 t} \right) \sin(kx + k^3 t)\, dk \\ &= \frac{\sin(lx + l^3 t)}{x + 3l^2 t} + \int_0^l \frac{6kt\, \sin(kx + k^3 t)}{(x + 3k^2 t)^2}\, dk. \end{aligned} \tag{8.96}$$

Provided $t \neq 0$, as $l \to \infty$, the first term on the right goes to zero, while the final integral
converges absolutely due to the rapid decay of the integrand.

Figure 8.10. Fundamental solution to the dispersive wave equation.


While the integral in the solution formula (8.95) cannot be evaluated in terms of
elementary functions, it is related to the integral defining the Airy function

$$\operatorname{Ai}(z) = \frac{1}{\pi} \int_0^{\infty} \cos\bigl( sz + \tfrac13 s^3 \bigr)\, ds, \tag{8.97}$$

an important special function, [86], that was first employed by the nineteenth-century
British applied mathematician George Airy in his studies of optical caustics (the focusing
of light waves through a lens, e.g., a magnifying glass) and rainbows, [4]. Indeed, applying
the change of variables

$$s = k\, \sqrt[3]{3t}, \qquad z = \frac{x}{\sqrt[3]{3t}},$$

to the Airy function integral (8.97), we deduce that the fundamental solution to the
dispersive wave equation (8.90) can be written as

$$u(t,x) = \frac{1}{\sqrt[3]{3t}}\, \operatorname{Ai}\left( \frac{x}{\sqrt[3]{3t}} \right). \tag{8.98}$$

See Figure 8.10 for a graph of the solution at several times; in particular, at $t = 1/3$
the solution is exactly the Airy function. We see that the immediate effect of the initial
delta impulse is to spawn a highly oscillatory wave trailing off to $-\infty$. (As with the heat
equation, signals propagate with infinite speed.) As time progresses, the dispersive effects
cause the oscillations to spread out, with their overall amplitude decaying in proportion to
$t^{-1/3}$. On the other hand, as $t \to 0^+$, the solution becomes more and more oscillatory for
negative $x$, and so converges weakly to the initial delta function. We also note that (8.98)
has the form of a similarity solution, since it is invariant under the scaling symmetry

$$(t, x, u) \longmapsto (\lambda^3 t,\ \lambda x,\ \lambda^{-1} u).$$
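The fundamental solution (8.98) can be tabulated without a special-function library for moderate arguments. The sketch below is an assumption-laden illustration, not the text's method: instead of the oscillatory integral (8.97), it evaluates the Airy function through its standard Maclaurin series $\operatorname{Ai}(z) = c_1 f(z) - c_2 g(z)$, with $c_1 = \operatorname{Ai}(0) = 3^{-2/3}/\Gamma(2/3)$ and $c_2 = -\operatorname{Ai}'(0) = 3^{-1/3}/\Gamma(1/3)$. The function names and the number of terms are hypothetical choices. The series loses accuracy through cancellation once $z$ is large and positive, so it is only suitable for roughly $|z| \le 6$.

```python
import math

def airy_ai(z, terms=60):
    """Maclaurin series for Ai(z); adequate for moderate |z| only."""
    c1 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))   # Ai(0)
    c2 = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))   # -Ai'(0)
    f_term, g_term = 1.0, z          # leading terms of the two series solutions of w'' = z w
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        f_term *= z**3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= z**3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return c1 * f_sum - c2 * g_sum

def fundamental_solution(t, x):
    """Formula (8.98): Ai(x / (3t)^(1/3)) / (3t)^(1/3), for t > 0."""
    scale = (3.0 * t) ** (1.0 / 3.0)
    return airy_ai(x / scale) / scale

if __name__ == "__main__":
    lam, t, x = 1.7, 0.4, -1.2
    # Scaling symmetry: u(lam^3 t, lam x) should equal u(t, x) / lam.
    print(fundamental_solution(lam**3 * t, lam * x), fundamental_solution(t, x) / lam)
```

The printed pair illustrates the similarity structure noted above: rescaling $t$ by $\lambda^3$ and $x$ by $\lambda$ rescales the solution by $\lambda^{-1}$.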

Equation (8.98) gives the response to an initial delta function concentrated at the
origin. By translation invariance, we immediately deduce that

$$F(t,x;\xi) = \frac{1}{\sqrt[3]{3t}}\, \operatorname{Ai}\left( \frac{x - \xi}{\sqrt[3]{3t}} \right)$$

is the fundamental solution corresponding to an initial delta impulse at $x = \xi$. Therefore,
we can use linear superposition to find an explicit formula for the solution to the initial
value problem that bypasses the Fourier transform. Namely, writing the general initial
data as a superposition of delta functions,

$$f(x) = \int_{-\infty}^{\infty} f(\xi)\, \delta(x - \xi)\, d\xi,$$

we conclude that the resulting solution is the selfsame combination of fundamental
solutions:

$$u(t,x) = \frac{1}{\sqrt[3]{3t}} \int_{-\infty}^{\infty} f(\xi)\, \operatorname{Ai}\left( \frac{x - \xi}{\sqrt[3]{3t}} \right) d\xi. \tag{8.99}$$

Figure 8.11. Periodic dispersion at irrational (with respect to $\pi$) times.


Example 8.15. Dispersive Quantization. Let us investigate the periodic
initial-boundary value problem for our basic linear dispersive equation on the interval $-\pi \leq x \leq \pi$:

$$u_t + u_{xxx} = 0, \qquad u(t,-\pi) = u(t,\pi), \quad u_x(t,-\pi) = u_x(t,\pi), \quad u_{xx}(t,-\pi) = u_{xx}(t,\pi), \tag{8.100}$$

with initial data $u(0,x) = f(x)$. The Fourier series formula for the resulting solution is
straightforwardly constructed:

$$u(t,x) = \sum_{k=-\infty}^{\infty} c_k\, e^{i(kx + k^3 t)}, \tag{8.101}$$

where $c_k$ are the usual (complex) Fourier coefficients (3.65) of the initial data $f(x)$.

Figure 8.12. Periodic dispersion at rational (with respect to $\pi$) times.


Let us take the initial data to be the unit step function: $u(0,x) = \sigma(x)$. In view of its
Fourier series (3.67), the resulting solution formula (8.101) becomes

$$u(t,x) = \frac12 - \frac{i}{\pi} \sum_{l=-\infty}^{\infty} \frac{e^{i[(2l+1)x + (2l+1)^3 t]}}{2l+1} = \frac12 + \frac{2}{\pi} \sum_{l=0}^{\infty} \frac{\sin\bigl[ (2l+1)x + (2l+1)^3 t \bigr]}{2l+1}. \tag{8.102}$$

Let us graph this solution. At times uniformly spaced by $\Delta t = .1$, the resulting solution
profiles are plotted in Figure 8.11. The solution appears to have a continuous but
fractal-like structure, reminiscent of Weierstrass' continuous but nowhere differentiable function,
[55; pp. 401-421]. The temporal evolution continues in this fashion until the initial data
are formed again at $t = 2\pi$, after which the process periodically repeats.
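Profiles like those in the figures can be generated from partial sums of the sine series in (8.102). The sketch below is illustrative only: the function names, the truncation `n_terms`, and the grid size are assumptions, and a truncated Fourier series of discontinuous data exhibits Gibbs oscillations that the exact solution does not have. Since every frequency in the series is an odd integer $m$ and $m^3 \cdot 2\pi$ is a multiple of $2\pi$, the partial sums are $2\pi$-periodic in $t$, in agreement with the recurrence of the initial data noted above.

```python
import math

def step_dispersion(t, x, n_terms=200):
    """Partial sum of the sine series in (8.102), using l = 0, ..., n_terms - 1."""
    total = 0.0
    for l in range(n_terms):
        m = 2 * l + 1
        total += math.sin(m * x + m**3 * t) / m
    return 0.5 + 2.0 / math.pi * total

def profile(t, n_points=512, n_terms=100):
    """Sample the partial sum on a uniform grid over [-pi, pi); returns (xs, values)."""
    xs = [-math.pi + 2.0 * math.pi * j / n_points for j in range(n_points)]
    return xs, [step_dispersion(t, x, n_terms) for x in xs]

if __name__ == "__main__":
    xs, irrational = profile(0.3)                 # a time like those of Figure 8.11
    xs, rational = profile(math.pi / 30.0)        # a rational multiple of pi
    print(min(irrational), max(irrational))
    print(min(rational), max(rational))
    print(step_dispersion(0.0, 0.7), step_dispersion(2.0 * math.pi, 0.7))   # nearly equal
```

The returned lists can be passed to any plotting package to compare profiles at rational and irrational multiples of $\pi$.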


However, when the times are spaced by $\Delta t = \frac{1}{30}\pi \approx .10472$, the resulting solution
profiles, as plotted in Figure 8.12, are strikingly different! Indeed, as you are asked to
prove in Exercise 8.5.8, at each rational time $t = 2\pi p/q$, where $p, q$ are integers, the
solution (8.102) to the initial-boundary value problem is discontinuous but constant on
subintervals of length $2\pi/q$. This remarkable behavior, in which the solution profiles of
linearly dispersive periodic boundary value problems have markedly different behaviors at
rational and irrational times (with respect to $\pi$), was first observed, in the 1990's, in optics
and quantum mechanics by the British physicist Michael Berry, [16, 115], and named the
Talbot effect, after an optical experiment conducted by the inventor of the photographic
negative, William Henry Fox Talbot. While writing this book, I rediscovered the effect,
which I like to call dispersive quantization, [88], and found that it arises in a wide range
of linearly dispersive periodic initial-boundary value problems, [30].

The Dispersion Relation


As noted earlier, a key feature of the third-order wave equation (8.90) is that waves disperse,
in the sense that those of different frequencies move at different speeds. Our goal now is
to better understand the dispersion process. To this end, consider a solution whose initial
profile

$$u(0,x) = e^{ikx}$$

is a complex oscillatory function. Since the initial data does not decay as $|x| \to \infty$, we
cannot use the Fourier integral solution formula (8.94) directly. Instead, anticipating the
induced wave to exhibit temporal oscillations, let us try an exponential solution ansatz

$$u(t,x) = e^{i(kx - \omega t)} \tag{8.103}$$

representing a complex oscillatory wave of temporal frequency $\omega$ and wave number (spatial
frequency) $k$. Since

$$\frac{\partial u}{\partial t} = -i\omega\, e^{i(kx - \omega t)}, \qquad \frac{\partial^3 u}{\partial x^3} = -i k^3\, e^{i(kx - \omega t)},$$

(8.103) satisfies the partial differential equation (8.90) if and only if its frequency and wave
number satisfy the dispersion relation

$$\omega = -k^3. \tag{8.104}$$

Therefore, the exponential solution (8.103) of wave number $k$ takes the form

$$u(t,x) = e^{i(kx + k^3 t)}. \tag{8.105}$$

Our Fourier transform formula (8.94) for the solution can thus be viewed as a (continuous)
linear superposition of these elementary exponential solutions. In general, to find the
dispersion relation for a linear constant-coefficient partial differential equation, one
substitutes the exponential ansatz (8.103). On cancellation of the common exponential factors,
the result is an equation expressing the frequency $\omega$ as a function of the wave number $k$.

Any exponential solution (8.103) is automatically in the form of a traveling wave, since
we can write

$$u(t,x) = e^{i(kx - \omega t)} = e^{ik(x - c_p t)}, \qquad \text{where} \qquad c_p = \frac{\omega}{k} \tag{8.106}$$

is the wave speed or, as it is more usually called, the phase velocity. If the dispersion
relation is linear in the wave number, $\omega = ck$, as occurs in the linear transport equation
$u_t + c u_x = 0$, then all waves move at an identical speed $c_p = c$, and hence localized
disturbances stay localized as they propagate through the medium. In the dispersive case,
$\omega$ is no longer a linear function of $k$, and so waves of different spatial frequencies move at
different speeds. In the particular case (8.90), those with wave number $k$ move at speed
$c_p = \omega/k = -k^2$, and so the higher the wave number, the faster the wave propagates to the
left. As the individual exponential constituents separate, the overall effect is the dispersive
decay of an initially localized wave, with slowly diminishing amplitude and increasingly
rapid oscillation as $t \to \infty$.

The general solution to the linear partial differential equation under consideration is
then built up by linear superposition of the exponential solutions,

$$u(t,x) = \int_{-\infty}^{\infty} e^{i(kx - \omega t)}\, g(k)\, dk, \tag{8.107}$$


where $\omega = \omega(k)$ is determined by the relevant dispersion relation. While the evolution of
the individual waves is an immediate consequence of the dispersion relation, the evolution
of the localized wave packet represented by (8.107) is less evident. To determine its speed
of propagation, let us switch to a moving coordinate frame of speed $c$ by setting $x = ct + \xi$.
The solution formula (8.107) then becomes

$$u(t, ct + \xi) = \int_{-\infty}^{\infty} e^{i(ck - \omega)t}\, e^{ik\xi}\, g(k)\, dk. \tag{8.108}$$

For a fixed value of $\xi$, the integral is of the general oscillatory form

$$H(t) = \int_{-\infty}^{\infty} e^{i\varphi(k)\, t}\, h(k)\, dk, \tag{8.109}$$

where, in our case, $\varphi(k) = ck - \omega(k)$ and $h(k) = e^{ik\xi} g(k)$. We are interested in
understanding the behavior of such an oscillatory integral as $t \to \infty$. Now, if $\varphi(k) = k$, then
(8.109) is just a Fourier integral, (7.9), and, as we learned in Chapter 7, $H(t) \to 0$ as
$t \to \infty$, for any reasonable function $h(k)$. Intuitively, the increasingly rapid oscillations of
the exponential factor tend to cancel each other out in the high-frequency limit. A similar
result holds wherever $\varphi(k)$ has no stationary points, i.e., $\varphi'(k) \neq 0$, since one can then
perform a local change of variables $\tilde k = \varphi(k)$ to convert that part of the oscillatory integral
to Fourier form, and again the increasingly rapid oscillations cause the limit to vanish. In
this fashion, we arrive at the key insight of Stokes and Kelvin that produced the powerful
Method of Stationary Phase. Namely, for large $t \gg 0$, the primary contribution to the
highly oscillatory integral (8.109) occurs at the stationary points of the phase function,
that is, where $\varphi'(k) = 0$. A rigorous justification of the method, along with precise error
bounds, can be found in [85].

In the present context, the Method of Stationary Phase implies that the most significant
contribution to the integral (8.108) occurs when

$$0 = \frac{d}{dk}\, (\omega - ck) = \frac{d\omega}{dk} - c. \tag{8.110}$$

Thus, surprisingly, the principal contribution of the components at wave number $k$ is felt
when moving at the group velocity

$$c_g = \frac{d\omega}{dk}. \tag{8.111}$$

Interestingly, unless the dispersion relation is linear in the wave number, the group velocity
(8.111), which determines the speed of propagation of the energy, is not the same as the
phase velocity (8.106), which governs the speed of propagation of an individual oscillatory
wave. For example, in the case of the dispersive wave equation (8.90), $\omega = -k^3$, and so
$c_g = -3k^2$, which is three times as fast as the phase velocity, $c_p = \omega/k = -k^2$. Thus, the
energy propagates faster than the individual waves. This can be observed in Figure 8.9:
while the bulk of the disturbance is spreading out rather rapidly to the left, the individual
wave crests are moving slower.

On the other hand, the dispersion relation associated with deep water waves is
(ignoring physical constants) $\omega = \sqrt{k}$, [122]. Now, the phase velocity is $c_p = \omega/k = 1/\sqrt{k}$,
whereas the group velocity is $c_g = d\omega/dk = 1/(2\sqrt{k}) = \frac12 c_p$, and so the individual waves
move twice as fast as the speed of propagation of the underlying wave energy. For an
experimental verification, just throw a stone in a still pond. An individual wave crest emerges
in back and then steadily grows as it moves through the disturbance, eventually subsiding
and disappearing into the still water ahead of the expanding wave packet triggered by the
stone. The distinction between group velocity and phase velocity is also well understood
by surfers, who know that the largest waves seen out to sea are not the largest when they
break upon the shore.
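The two velocities (8.106) and (8.111) are easy to compare numerically for any dispersion relation supplied as a function $\omega(k)$. The sketch below is a minimal illustration: the function names are hypothetical, and the group velocity is approximated by a centered difference rather than computed symbolically. It reproduces the two cases discussed above, $\omega = -k^3$ and the deep water relation $\omega = \sqrt{k}$ with physical constants ignored.

```python
import math

def phase_velocity(omega, k):
    """c_p = omega(k) / k, as in (8.106); requires k != 0."""
    return omega(k) / k

def group_velocity(omega, k, h=1e-5):
    """c_g = d omega / dk, as in (8.111), approximated by a centered difference."""
    return (omega(k + h) - omega(k - h)) / (2.0 * h)

def dispersive_wave(k):
    """Dispersion relation (8.104) of the equation u_t + u_xxx = 0."""
    return -k**3

def deep_water(k):
    """Deep water dispersion relation omega = sqrt(k), constants ignored; requires k > 0."""
    return math.sqrt(k)

if __name__ == "__main__":
    for name, omega, k in (("dispersive wave", dispersive_wave, 2.0),
                           ("deep water", deep_water, 4.0)):
        c_p = phase_velocity(omega, k)
        c_g = group_velocity(omega, k)
        print(name, "c_p =", c_p, "c_g =", c_g, "ratio c_g/c_p =", c_g / c_p)
```

The printed ratios should be close to 3 for the dispersive wave equation and close to 1/2 for deep water waves.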


Exercises 


8.5.1.  Sketch  a  picture  of  the  solution  for  the  initial  value  problem  in  Example  8.13  at  times 
t  =  —.1,  —.5,  and  —1. 


8.5.2. (a) Write down an integral formula for the solution to the dispersive wave equation (8.90)
with initial data $u(0,x) = \begin{cases} 1, & 0 < x < 1, \\ 0, & \text{otherwise}. \end{cases}$ (b) Use a computer package to plot your
solution at several times and discuss what you observe.

8.5.3. (a) Write down an integral formula for the solution to the initial value problem

$$u_t + u_x + u_{xxx} = 0, \qquad u(0,x) = f(x).$$

(b) Based on the results in Example 8.13, discuss the behavior of the solution to the initial
value problem $u(0,x) = e^{-x^2}$ as $t$ increases.

8.5.4. Find the (i) dispersion relation, (ii) phase velocity, and (iii) group velocity for the
following partial differential equations. Which are dispersive? (a) $u_t + u_x + u_{xxx} = 0$,

(b) $u_t = u_{xxxxx}$, (c) $u_t + u_x - u_{xxt} = 0$, (d) $u_{tt} = c^2 u_{xx}$, (e) $u_{tt} = u_{xx} - u_{xxxx}$.

8.5.5.  Find  all  linear  evolution  equations  for  which  the  group  velocity  equals  the  phase  velocity. 
Justify  your  answer. 

8.5.6.  Show  that  the  phase  velocity  is  greater  than  the  group  velocity  if  and  only  if  the  phase 
velocity  is  a  decreasing  function  of  k  for  k  >  0  and  an  increasing  function  of  k  for  k  <  0. 
How  would  you  observe  this  in  a  physical  system? 

♦ 8.5.7. (a) Conservation of Mass: Prove that $T = u$ is a density associated with a conservation
law of the dispersive wave equation (8.90). What is the corresponding flux? Under what
conditions is total mass conserved? (b) Conservation of Energy: Establish the same result
for the energy density $T = u^2$. (c) Is $u^3$ the density of a conservation law?

♦ 8.5.8. Prove that when $t = \pi p/q$, where $p, q$ are integers, the solution (8.102) is constant on
each interval $\pi j/q < x < \pi (j+1)/q$ for integers $j \in \mathbb{Z}$. Hint: Use Exercise 6.1.29(d).
Remark: The proof that the solution is continuous and fractal at irrational times is
considerably more difficult, [90].

♦ 8.5.9. (a) Find the complex Fourier series representing the fundamental solution $F(t,x;\xi)$ to
the periodic initial-boundary value problem (8.100). (b) Prove that at time $t = 2\pi p/q$,
where $p, q$ are relatively prime integers, $F(t,x;\xi)$ is a linear combination of delta functions
based at the points $\xi + 2\pi j/q$. Hint: Use Exercise 6.1.29(c). (c) Let $u(t,x)$ be any solution
to (8.100). Prove that $u(2\pi p/q, x)$ is a linear combination of a finite number of translates,
$f(x - x_j)$, of the initial data.


The Korteweg-de Vries Equation


The  simplest  wave  model  that  combines  dispersion  with  nonlinearity  is  the  celebrated 
Korteweg-de  Vries  equation 

$$u_t + u_{xxx} + u\,u_x = 0. \qquad (8.112)$$

It was first derived, in 1872, by the French applied mathematician Joseph Boussinesq, [21;
eq.  (30)],  [22;  eqs.  (283,  291)],  as  a  model  for  surface  waves  on  shallow  water.  Two  decades 
later,  it  was  rediscovered  by  the  Dutch  applied  mathematician  Diederik  Korteweg  and  his 
student  Gustav  de  Vries,  [65],  and,  despite  Boussinesq’s  priority,  it  is  nowadays  named 
after  them.  In  the  early  1960s,  the  American  mathematical  physicists  Martin  Kruskal  and 
Norman  Zabusky,  [125],  used  the  Korteweg-de  Vries  equation  as  a  continuum  model  for 
a one-dimensional chain of masses interconnected by nonlinear springs: the Fermi-Pasta-Ulam
problem, [40]. Numerical experimentation revealed its many remarkable properties,
which  were  soon  rigorously  established.  Their  work  sparked  the  rapid  development  of  one 
of  the  most  remarkable  and  far-reaching  discoveries  of  the  modern  era:  integrable  nonlinear 
partial  differential  equations,  [2,36]. 

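The most important special solutions to the Korteweg-de Vries equation are the traveling
waves. We seek solutions

$$u = v(\xi) = v(x - ct), \qquad \text{where} \qquad \xi = x - ct,$$

that have a fixed profile while moving with speed $c$. By the chain rule,

$$\frac{\partial u}{\partial t} = -c\,v'(\xi), \qquad \frac{\partial u}{\partial x} = v'(\xi), \qquad \frac{\partial^3 u}{\partial x^3} = v'''(\xi).$$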

Substituting these expressions into the Korteweg-de Vries equation (8.112), we conclude
that $v(\xi)$ must satisfy the nonlinear third-order ordinary differential equation


v'"  +  vv'  -  cv'  =  0.  (8.113) 

Let us further assume that the traveling wave is localized, meaning that the solution and
its derivatives are vanishingly small at large distances:

$$\lim_{x \to \pm\infty} u(t,x) = \lim_{x \to \pm\infty} \frac{\partial u}{\partial x}(t,x) = \lim_{x \to \pm\infty} \frac{\partial^2 u}{\partial x^2}(t,x) = 0. \qquad (8.114)$$

This implies that we should impose the boundary conditions

$$\lim_{\xi \to \pm\infty} v(\xi) = \lim_{\xi \to \pm\infty} v'(\xi) = \lim_{\xi \to \pm\infty} v''(\xi) = 0. \qquad (8.115)$$


The ordinary differential equation (8.113) can, in fact, be solved in closed form. First,
note that it has the form

$$\frac{d}{d\xi}\left( v'' + \tfrac12 v^2 - c\,v \right) = 0, \qquad \text{and hence} \qquad v'' + \tfrac12 v^2 - c\,v = a,$$

where $a$ indicates the constant of integration. The localizing boundary conditions (8.115)
imply that $a = 0$. Multiplying the resulting equation by $v'$ allows us to integrate a second
time:

$$0 = v'\left( v'' + \tfrac12 v^2 - c\,v \right) = \frac{d}{d\xi}\left[ \tfrac12 (v')^2 + \tfrac16 v^3 - \tfrac12 c\,v^2 \right].$$

Figure 8.13. Solitary wave/soliton.

Thus,

$$\tfrac12 (v')^2 + \tfrac16 v^3 - \tfrac12 c\,v^2 = b,$$

where $b$ is a second constant of integration, which, again by the boundary conditions
(8.115), is also zero. Setting $b = 0$, and solving for $v'$, we conclude that $v(\xi)$ satisfies the
autonomous first-order ordinary differential equation

$$\frac{dv}{d\xi} = v \sqrt{c - \tfrac13 v},$$

which is integrated by the standard method:

$$\int \frac{dv}{v \sqrt{c - \tfrac13 v}} = \xi + \delta,$$

where $\delta$ is constant. Consulting a table of integrals, e.g., [48], and then solving for $v$, we
conclude that the solution has the form

$$v(\xi) = 3c\,\operatorname{sech}^2\left( \tfrac12 \sqrt{c}\,\xi + \delta \right), \qquad (8.116)$$

where

$$\operatorname{sech} y = \frac{1}{\cosh y} = \frac{2}{e^{y} + e^{-y}}$$

is the hyperbolic secant function. The solution has the form graphed in Figure 8.13. It is
a symmetric, monotone, exponentially decreasing function on either side of its maximum
height of $3c$. (Despite its suggestive profile, it is not a Gaussian.) The resulting localized
traveling-wave solutions to the Korteweg-de Vries equation are thus

$$u(t,x) = 3c\,\operatorname{sech}^2\left[ \tfrac12 \sqrt{c}\,(x - ct) + \delta \right], \qquad (8.117)$$


where $c > 0$ represents the wave speed (which is necessarily positive, and so all such
solutions move to the right), while $\delta$ represents an overall phase shift. The amplitude of
the wave is three times its speed, while its width is proportional to $1/\sqrt{c}$. Thus, the taller
(and narrower) the wave, the faster it moves.
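
As a sanity check on the solitary wave formula, one can evaluate the left-hand side of the Korteweg-de Vries equation $u_t + u_{xxx} + u\,u_x = 0$ numerically. The sketch below assumes the profile $u = 3c\,\operatorname{sech}^2[\tfrac12\sqrt{c}\,(x - ct) + \delta]$ and approximates the derivatives by central differences; the step size `h`, the sample points, and the function names are illustrative assumptions.

```python
import math

def soliton(t, x, c=1.0, delta=0.0):
    """Solitary wave u = 3c sech^2( sqrt(c)/2 * (x - c t) + delta )."""
    s = 0.5 * math.sqrt(c) * (x - c * t) + delta
    return 3.0 * c / math.cosh(s) ** 2

def kdv_residual(u, t, x, h=5e-3):
    """Central-difference estimate of u_t + u_xxx + u u_x at the point (t, x)."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    u_xxx = (u(t, x + 2 * h) - 2 * u(t, x + h)
             + 2 * u(t, x - h) - u(t, x - 2 * h)) / (2 * h ** 3)
    return u_t + u_xxx + u(t, x) * u_x

sample_points = [(0.0, -2.0), (0.0, 0.3), (1.5, 1.0), (1.5, 4.0), (3.0, 2.5)]
max_residual = max(abs(kdv_residual(soliton, t, x)) for t, x in sample_points)
print("largest |u_t + u_xxx + u u_x| on the samples:", max_residual)
print("peak height u(0, 0):", soliton(0.0, 0.0))
```

The residual is not exactly zero because of the finite-difference error, but it should be small compared with the size of the individual terms, while the peak height equals three times the speed $c = 1$.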

Localized  traveling  waves  are  commonly  known  as  solitary  waves.  They  were  first 
observed  in  nature  by  the  British  engineer  J.  Scott  Russell,  [104],  who  recounts  how  one  was 
triggered  by  the  sudden  motion  of  a  barge  along  an  Edinburgh  canal.  Scott  Russell  ended 
up  chasing  the  propagating  wave  on  horseback  for  several  miles  —  a  physical  indication 
of its stability. Russell’s observations were dismissed by his contemporary Airy, who,
relying on his linearly dispersive model for surface waves (8.90), claimed that such localized
disturbances could not exist. Much later, Boussinesq derived the proper nonlinear surface
wave model (8.112), valid for long waves in shallow water, along with its solitary wave
solutions (8.117), thereby fully exonerating Russell’s physical observations and insight.

Figure 8.14. Interaction of two solitons.

It  took  almost  a  century  before  all  the  remarkable  properties  of  these  solutions  came 
to  light.  The  most  striking  is  how  two  such  solitary  waves  interact.  While  linear  equations 
always  admit  a  superposition  principle,  one  cannot  naively  combine  two  solutions  to  a 
nonlinear  equation.  However,  in  the  case  of  the  Korteweg-de  Vries  equation,  suppose  the 
initial  data  represent  a  taller  solitary  wave  to  the  left  of  a  shorter  one.  As  time  evolves, 
the  taller  wave  will  move  faster,  and  eventually  catch  up  to  the  shorter  one.  They  then 
experience  a  complicated  nonlinear  interaction,  as  expected.  But,  remarkably,  after  a 
while, they emerge from the interaction unscathed! The smaller wave is now in back and
the larger one in front, and both are unchanged in speed, amplitude, and profile. They then
proceed independently, with the smaller solitary wave lagging farther and farther behind
the faster, taller wave. The only effect of their encounter is an overall phase shift, so that
the taller wave is a bit ahead of where it would be if it had not encountered the shorter
wave, while the shorter wave is a little behind its unhindered position. Figure 8.14 plots
a typical such interaction.

Owing  to  this  “particle-like”  behavior  under  interaction,  these  solutions  were  given 
a  special  name:  soliton.  An  explicit  formula  for  a  two-soliton  solution  to  the  Korteweg- 
de  Vries  equation  can  be  written  in  the  following  form: 


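$$u(t,x) = 12\,\frac{\partial^2}{\partial x^2} \log \Delta(t,x), \qquad (8.118)$$

where

$$\Delta(t,x) = \det \begin{pmatrix} 1 + \varepsilon_1(t,x) & \dfrac{2 b_1}{b_1 + b_2}\,\varepsilon_2(t,x) \\[2ex] \dfrac{2 b_2}{b_1 + b_2}\,\varepsilon_1(t,x) & 1 + \varepsilon_2(t,x) \end{pmatrix}, \qquad (8.119)$$

where $0 < b_1 < b_2$, and

$$\varepsilon_j(t,x) = \exp\left[\, b_j\,(x - b_j^2\,t) + d_j \,\right], \qquad j = 1, 2. \qquad (8.120)$$
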

The constants $c_j = b_j^2$ represent the wave speeds, while the $d_j$ correspond to phase shifts of
the individual solitons. Proving that (8.118) is indeed a solution to the Korteweg-de Vries
equation is a straightforward, albeit tedious, exercise in differentiation. In Exercise 8.5.14,
the reader is asked to investigate its asymptotic behavior, as $t \to \pm\infty$, and prove that the
solution does, indeed, break up into two solitons, having the same profiles, speeds, and
amplitudes in both the distant past and future.
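
The two-soliton formula can be explored numerically before attempting the asymptotic analysis. The sketch below is an illustration under stated assumptions: it takes $\varepsilon_j = \exp[b_j(x - b_j^2 t) + d_j]$, expands the $2 \times 2$ determinant as $\Delta = 1 + \varepsilon_1 + \varepsilon_2 + A\,\varepsilon_1 \varepsilon_2$ with $A = ((b_2 - b_1)/(b_1 + b_2))^2$, and uses $12\,\partial_x^2 \log \Delta = 12\,(\Delta \Delta_{xx} - \Delta_x^2)/\Delta^2$. The values $b_1 = 1$, $b_2 = 1.5$, the grid, and all function names are hypothetical choices.

```python
import math

def two_soliton(t, x, b1=1.0, b2=1.5, d1=0.0, d2=0.0):
    """Evaluate u = 12 (log Delta)_xx for the expanded 2 x 2 determinant Delta."""
    e1 = math.exp(b1 * (x - b1 ** 2 * t) + d1)
    e2 = math.exp(b2 * (x - b2 ** 2 * t) + d2)
    # Expanding the determinant: Delta = 1 + e1 + e2 + A e1 e2.
    A = ((b2 - b1) / (b1 + b2)) ** 2
    D = 1.0 + e1 + e2 + A * e1 * e2
    # Each exponential satisfies d(e_j)/dx = b_j e_j, so the x-derivatives are explicit.
    Dx = b1 * e1 + b2 * e2 + A * (b1 + b2) * e1 * e2
    Dxx = b1 ** 2 * e1 + b2 ** 2 * e2 + A * (b1 + b2) ** 2 * e1 * e2
    return 12.0 * (D * Dxx - Dx ** 2) / D ** 2

def profile(t, x_min=-60.0, x_max=60.0, dx=0.05):
    """Sample u(t, .) on a uniform grid; returns (xs, us)."""
    n = int(round((x_max - x_min) / dx))
    xs = [x_min + i * dx for i in range(n + 1)]
    return xs, [two_soliton(t, x) for x in xs]

summary = {}
for t in (-10.0, 10.0):
    xs, us = profile(t)
    peak = max(us)
    where = xs[us.index(peak)]
    mass = sum(us) * (xs[1] - xs[0])   # Riemann sum for the integral of u over the grid
    summary[t] = (peak, where, mass)
    print(f"t = {t:6.1f}: tallest peak {peak:.4f} at x = {where:.2f}, mass {mass:.4f}")
```

Well before and well after the collision, the tallest peak should be close to $3 b_2^2 = 6.75$, and the total mass should stay close to $12\,(b_1 + b_2) = 30$. Comparing the printed peak positions with $b_2^2\,t$ gives a direct reading of the phase shift of the taller wave.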

A  similar  dynamic  occurs  when  there  are  multiple  collisions  among  solitons.  Faster 
solitons  catch  up  to  slower  ones  moving  to  their  right.  After  the  various  solitons  finish 
colliding  and  interacting,  they  emerge  in  order,  from  smallest  to  largest,  each  moving  at 
its  characteristic  speed  and  becoming  more  and  more  separated  from  its  peers.  An  explicit 
formula for the n-soliton solution is provided by the same logarithmic derivative (8.118) in
which $\Delta(t,x)$ now represents the determinant of an $n \times n$ matrix whose $i$th diagonal entry
is $1 + \varepsilon_i(t,x)$, while the off-diagonal $(i,j)$ entry, $i \ne j$, is
$\dfrac{2 b_i}{b_i + b_j}\,\varepsilon_j(t,x)$, using the same
formula (8.120) for the $\varepsilon_j$'s, and where $0 < b_1 < \cdots < b_n$ correspond to the $n$ different
soliton wave speeds $c_i = b_i^2$. Furthermore, it can be shown that, starting with an arbitrary
localized initial disturbance $u(0,x) = f(x)$ that decays sufficiently rapidly as $|x| \to \infty$, the
resulting  solution  eventually  emits  a  finite  number  of  solitons  of  different  heights,  moving 
off  at  their  respective  speeds  to  the  right,  and  so  arranged  in  order  from  smallest  to  largest, 
followed  by  a  small,  asymptotically  self-similar  dispersive  tail  that  gradually  disappears. 

The  source  of  these  highly  non-obvious  facts  and  formulas  lies  beyond  the  scope  of 
this  introductory  text.  Soon  after  the  initial  numerical  studies,  Gardner,  Green,  Kruskal, 
and  Miura,  [45],  discovered  a  profound  connection  between  the  solutions  to  the  Korteweg- 
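de Vries equation and the eigenvalues $\lambda$ of the Sturm-Liouville boundary value problem

$$\frac{d^2\psi}{dx^2} + \tfrac16\,u(t,x)\,\psi = \lambda\,\psi, \quad -\infty < x < \infty, \qquad \text{with} \qquad \psi(t,x) \to 0 \quad \text{as} \quad |x| \to \infty. \qquad (8.121)$$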

Their remarkable result is that whenever $u(t,x)$ is a localized solution to the Korteweg-
de Vries equation (8.112), the eigenvalues of (8.121) are constant, meaning that they do not
vary with the time $t$, while the continuous spectrum has a very simple temporal evolution.
In  physical  applications  of  the  stationary  Schrodinger  equation  (8.121),  in  which  u(t,x) 
represents  a  quantum-mechanical  potential,  the  eigenvalues  correspond  to  bound  states, 
while  the  continuous  spectrum  governs  its  scattering  behavior.  The  solution  to  the  so- 
called  inverse  scattering  problem  reconstructs  the  potential  u(t,  x )  from  its  spectrum,  and 
can  be  viewed  as  a  nonlinear  version  of  the  Fourier  transform,  in  that  it  effectively  linearizes 
the  Korteweg-de  Vries  equation  and  thereby  reveals  its  many  remarkable  properties.  In 
particular,  the  eigenvalues  are  responsible  for  the  preceding  determinantal  formulae  for  the 
multi-soliton  solutions,  while,  when  present,  the  continuous  spectrum  governs  the  dispersive 
tail.  See  [2,36]  for  additional  details. 


Exercises 


8.5.10.  Justify  the  statement  that  the  width  of  a  soliton  is  proportional  to  the  inverse  of  the 
square  root  of  its  speed. 

8.5.11. Prove that the function (8.116) is a symmetric, monotone, exponentially decreasing
function on either side of its maximum height of $3c$.


8.5.12.  Let  u(t,x)  solve  the  Korteweg-de  Vries  equation. 

(a) Show that $U(t,x) = u(t, x - ct) + c$ is also a solution.

(b) Give a physical interpretation of this symmetry.

8.5.13.  (a)  Find  all  scaling  symmetries  of  the  Korteweg-de  Vries  equation. 

(b)  Write  down  an  ansatz  for  the  similarity  solutions,  and  then  find  the  corresponding 
reduced  ordinary  differential  equation.  (Unfortunately,  the  similarity  solutions  cannot 
be  written  in  terms  of  elementary  functions,  [2].) 


♥ 8.5.14. (a) Let $u(t,x)$ be the two-soliton solution defined in (8.118). Let $\tilde u(t,\xi) = u(t, \xi + ct)$
represent the solution as viewed in a coordinate frame moving with speed $c$. Prove that

$$\lim_{t \to \infty} \tilde u(t,\xi) = \begin{cases} 3c_1\,\operatorname{sech}^2\left[ \tfrac12 \sqrt{c_1}\,\xi + \delta_1 \right], & c = c_1, \\[1ex] 3c_2\,\operatorname{sech}^2\left[ \tfrac12 \sqrt{c_2}\,\xi + \delta_2 \right], & c = c_2, \\[1ex] 0, & \text{otherwise,} \end{cases}$$

for suitable constants $\delta_1, \delta_2$. Explain why this justifies the statement that the solution
indeed breaks up into two individual solitons as $t \to \infty$. (b) Explain why $\tilde u(t,\xi)$ has a
similar limiting behavior as $t \to -\infty$, but with possibly different constants $\delta_1, \delta_2$.

(c) Use your formulas to discuss how the solitons are affected by the collision.


8.5.15. Let $\alpha, \beta \ne 0$. Find the soliton solutions to the rescaled Korteweg-de Vries equation
$u_t + \alpha\,u_{xxx} + \beta\,u\,u_x = 0$. How are their speed, amplitude, and width interrelated?

8.5.16. (a) Find the solitary wave solutions to the modified Korteweg-de Vries equation
$u_t + u_{xxx} + u^2 u_x = 0$. (b) Discuss how the amplitude and width of a solitary wave are
related to its speed. Note: The modified Korteweg-de Vries equation is also integrable, and
its solitary wave solutions are solitons, cf. [36].

8.5.17. Answer Exercise 8.5.16 for the Benjamin-Bona-Mahony equation $u_t - u_{xxt} + u\,u_x = 0$,
[14]. Note: The BBM equation is not integrable, and collisions between its solitary waves
produce a small, but measurable, inelastic effect, [1].

♦ 8.5.18. (a) Show that $T_1 = u$ is the density for a conservation law for the Korteweg-de Vries
equation. (b) Show that $T_2 = u^2$ is also a conserved density. (c) Find a conserved density
of the form $T_3 = u_x^2 + \mu\,u^3$ for a suitable constant $\mu$. Remark: The Korteweg-de Vries
equation in fact has infinitely many conservation laws, whose densities depend on higher
and higher-order derivatives of the solution, [76, 87]. It was this discovery that unlocked
the door to all its remarkable integrability properties, [2,36].

8.5.19.  Find  two  conservation  laws  of 

(a)  the  modified  Korteweg-de  Vries  equation  ut  +  uxxx  +  u2ux  =0; 

(b)  the  Benjamin-Bona-Mahony  equation  ut  —  uxxt  +  uux  =0. 


Chapter  9 

A  General  Framework  for 

Linear  Partial  Differential  Equations 


Before  pressing  on  to  the  higher-dimensional  manifestations  of  the  heat,  wave,  and 
Laplace/Poisson equations, it is worth pausing to develop a general, abstract, linear-
algebraic  framework  that  underlies  many  of  the  linear  partial  differential  equations  arising 
throughout  the  subject  and  its  applications.  The  power  of  mathematical  abstraction  is 
that  concentrating  on  the  essential  features  and  not  being  distracted  by  the  at  times  messy 
particular  details  enables  one  to  establish,  relatively  painlessly,  very  general  results  that  can 
be  applied  throughout  the  subject  and  beyond.  Each  abstract  concept  has,  as  its  source, 
an  elementary  finite-dimensional  version  valid  for  linear  algebraic  systems  and  matrices, 
which  is  then  generalized  and  extended  to  include  linear  boundary  value  problems  and 
then  initial-boundary  value  problems  governed  by  differential  equations.  All  of  the  abstract 
definitions  and  results  contained  here  will  be  immediately  applicable  to  the  boundary  and 
initial  value  problems  of  physical  interest,  and  serve  to  deepen  our  understanding  of  the 
underlying  commonalities  among  systems  and  solution  techniques.  Nevertheless,  a  more 
applications-oriented  reader  may  prefer  to  skip  ahead  to  the  more  concrete  developments 
contained  in  the  following  chapters,  referring  to  the  background  material  presented  here  as 
necessary. 

Most  equilibrium  systems  are  modeled  as  boundary  value  problems  involving  a  linear 
differential  operator  that  satisfies  the  two  key  conditions  of  being  “self-adjoint”  and  either 
“positive  definite”  or,  slightly  more  generally,  “positive  semi-definite”.  So,  our  first  task 
is  to  introduce  the  adjoint  of  a  linear  function  in  general,  and,  for  our  specific  purposes,  a 
linear  differential  operator.  The  adjoint  is  a  far-reaching  generalization  of  the  elementary 
matrix  transpose.  Its  formulation  relies  on  the  specification  of  inner  products  on  both  the 
domain  and  target  spaces  of  the  operator,  and,  when  one  is  dealing  with  linear  differential 
operators,  the  imposition  of  suitable  homogeneous  boundary  conditions  on  the  spaces  of 
allowable  functions.  In  applications,  the  relevant  inner  products  are  typically  dictated 
by  the  underlying  physics.  One  immediate  application  of  the  adjoint  is  the  Fredholm 
Alternative,  which  delineates  the  constraints  required  for  the  existence  of  solutions  to 
linear  systems,  including  linear  boundary  value  problems. 

A linear operator that equals its own adjoint is called self-adjoint. The simplest example
is the linear function defined by a symmetric matrix. The most important subclasses
are the positive definite and positive semi-definite operators, which are the natural analogues
of positive (semi-)definite matrices. We will learn how to construct self-adjoint
positive (semi-)definite operators in a canonical manner. Almost all of the linear differential
operators studied in this text, including the Laplacian, are, when subject to suitable
boundary conditions, self-adjoint and either positive definite or positive semi-definite. The
key distinction is that positive definite linear systems and boundary value problems admit
unique solutions, whereas in the positive semi-definite case, the solution either does not
exist, since the Fredholm constraints are not satisfied, or, when it exists, is not unique. In
their dynamical manifestations, positive definite operators induce stable vibrational systems,
whereas the positive semi-definite cases contain unstable modes that can lead to
disastrous physical consequences.

A  critically  important  fact  is  that  the  solution  to  a  positive  definite  linear  system 
can  be  characterized  by  a  minimization  principle,  provided  by  a  certain  quadratic  function 
or,  in  the  infinite-dimensional  function-space  version,  quadratic  functional.  In  physical 
contexts,  the  function(al)  often  represents  the  potential  energy  of  the  system,  and  the 
solution  minimizes  said  energy  among  all  possible  configurations  satisfying  the  prescribed 
boundary  conditions,  thereby  quantifying  the  maxim  that  Nature  is  inherently  conservative 
and  seeks  to  minimize  energy.  In  mathematics,  minimization  principles  underlie  advanced 
functional-analytic methods used to establish existence theorems, as well as the finite
element  numerical  schemes  to  be  presented  in  Chapter  10. 

For linear dynamical systems like the heat and wave equations, separation of variables
leads to an eigenvalue problem for the linear differential operator governing the corresponding
equilibrium system. In the simple one-dimensional cases discussed in Chapter 4,
the eigenfunctions are trigonometric, producing the classical Fourier expansions for the
solutions. The effectiveness of the Fourier method relies on the eigenfunctions’ orthogonality,
and we already hinted that this is no accident. Rather, it is a consequence of their
status as the eigenfunctions of a self-adjoint linear operator. Not only are such eigenfunctions
automatically mutually orthogonal with respect to the underlying inner product, the
eigenvalues are necessarily real and, when the operator is positive definite, also positive.

Orthogonality underlies the Fourier-like expansion of quite general functions as series
in the eigenfunctions, whose convergence, in general, requires that the eigenfunctions form
a complete system. For positive definite boundary value problems on bounded domains,
we will establish completeness by combining the eigenfunction expansion for the associated
Green’s function with a basic minimization principle for the eigenvalues based on the
Rayleigh quotient. On the other hand, problems on unbounded domains do not typically
admit complete systems of eigenfunctions and require the more advanced analytical concepts
of continuous spectrum and generalized Fourier transforms that lie beyond the scope
of this text.

The  chapter  concludes  by  describing  a  general  framework  for  dynamics  that  produces 
time-dependent  series  solutions,  in  terms  of  the  eigenfunctions  of  the  underlying  equilibrium 
operator,  for  diffusion  equations,  vibration  equations,  and  quantum-mechanical  systems. 
The  final  two  chapters  will  then  specialize  these  general  theories  and  constructions  to 
analyze  initial-boundary  value  problems  for  the  two-  and  three-dimensional  heat,  wave, 
and  Schrodinger  equations  in  simple  geometries.  More  advanced  developments  and  further 
applications  can  be  found  in  higher-level  texts,  including  [35,  38,  44,  61,  99]. 

9.1  Adjoints 

Our  starting  point  is  a  linear  operator 


$$L \colon U \to V \qquad (9.1)$$




that  maps  a  vector  space  U  to  another  vector  space  V.  For  most  of  the  development, 
we  deal  with  real  vector  spaces,  although  the  final  discussion  of  the  Schrodinger  equation 
requires  us  to  venture  into  the  complex  realm.  For  our  purposes,  L  represents  a  linear 
differential  operator,  and  the  elements  of  the  domain  space  U  and  the  target  space  V 
are suitable scalar- or vector-valued functions. In elastomechanics, the elements of U are
displacements  of  a  deformable  body,  while  the  elements  of  V  are  the  associated  strains.  In 
electromagnetism  and  gravitation,  elements  of  U  represent  potentials,  and  elements  of  V 
are  electric  or  magnetic  or  gravitational  fields.  In  thermodynamics,  U  contains  temperature 
distributions,  while  V  contains  temperature  gradients.  In  fluid  mechanics,  U  is  the  space 
of  potential  functions,  while  V  is  the  space  of  fluid  velocities.  And  so  on. 

The  abstract  definition  of  the  adjoint  of  a  linear  operator  relies  on  an  inner  product 
structure  on  both  its  domain  and  target  spaces.  We  distinguish  the  inner  products  on  U 
and  V  (which  may  be  different  even  when  U  and  V  happen  to  be  the  same  vector  space) 
by using a single angle bracket

$\langle u, \tilde u \rangle$ to denote the inner product between $u, \tilde u \in U$,
and a double angle bracket

$\langle\langle v, \tilde v \rangle\rangle$ to denote the inner product between $v, \tilde v \in V$.


In  applications,  the  appropriate  inner  products  are  often  based  on  the  underlying  physics. 

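Definition 9.1. Let $U$, $V$ be inner product spaces, and let $L \colon U \to V$ be a linear
operator. The adjoint of $L$ is the unique linear operator $L^* \colon V \to U$ that satisfies

$$\langle\langle L[u], v \rangle\rangle = \langle u, L^*[v] \rangle \qquad \text{for all} \qquad u \in U, \quad v \in V. \qquad (9.2)$$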


Observe  that  the  adjoint  goes  in  the  reverse  direction,  that  is,  from  V  back  to  U .  To 
master  the  definition,  let  us  first  look  at  the  finite-dimensional  case. 


Example 9.2. According to Theorem B.33, every linear function $L \colon \mathbb{R}^n \to \mathbb{R}^m$ is
given by matrix multiplication, so that $L[u] = A u$ for $u \in \mathbb{R}^n$, where $A$ is an $m \times n$
matrix. The adjoint function $L^* \colon \mathbb{R}^m \to \mathbb{R}^n$ is also linear, so it is also represented by
matrix multiplication, $L^*[v] = A^* v$ for $v \in \mathbb{R}^m$, by an $n \times m$ matrix $A^*$.

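Suppose first that we impose the ordinary Euclidean dot products

$$\langle u, \tilde u \rangle = u \cdot \tilde u = u^T \tilde u, \quad u, \tilde u \in \mathbb{R}^n, \qquad \langle\langle v, \tilde v \rangle\rangle = v \cdot \tilde v = v^T \tilde v, \quad v, \tilde v \in \mathbb{R}^m,$$

as our inner products on both $\mathbb{R}^n$ and $\mathbb{R}^m$. Evaluation of both sides of the adjoint identity
(9.2) yields

$$\langle\langle L[u], v \rangle\rangle = \langle\langle A u, v \rangle\rangle = (A u)^T v = u^T A^T v, \qquad \langle u, L^*[v] \rangle = \langle u, A^* v \rangle = u^T A^* v. \qquad (9.3)$$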


Since these expressions must agree for all $u, v$, we conclude (see Exercise 9.1.6) that the
matrix $A^*$ representing $L^*$ is equal to the transposed matrix $A^T$. Therefore, the adjoint
of a matrix with respect to the Euclidean dot product is its transpose: $A^* = A^T$. So one
can regard the adjoint as a vast generalization of the elementary operation of transposing
a matrix.

More generally, suppose we take weighted inner products on the domain and target
spaces:

$$\langle u, \tilde u \rangle = u^T M \tilde u, \quad u, \tilde u \in \mathbb{R}^n, \qquad \langle\langle v, \tilde v \rangle\rangle = v^T C \tilde v, \quad v, \tilde v \in \mathbb{R}^m, \qquad (9.4)$$

where $M$ and $C$ are symmetric, positive definite matrices of respective sizes $n \times n$ and
$m \times m$, cf. Proposition B.13. Then, repeating the previous calculation (9.3), we find


$$\langle\langle L[u], v \rangle\rangle = \langle\langle A u, v \rangle\rangle = (A u)^T C v = u^T A^T C v, \qquad \langle u, L^*[v] \rangle = \langle u, A^* v \rangle = u^T M A^* v. \qquad (9.5)$$


Comparing these expressions, we conclude that the weighted adjoint matrix is

$$A^* = M^{-1} A^T C. \qquad (9.6)$$


Therefore,  the  adjoint  does  indeed  depend  on  which  inner  products  are  being  used  on  both 
the  domain  and  target  spaces. 
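
The weighted adjoint formula $A^* = M^{-1} A^T C$ can be tested on a small example. The sketch below uses hypothetical matrices (a $3 \times 2$ matrix `A`, a $2 \times 2$ weight `M`, and a $3 \times 3$ weight `C`, both symmetric positive definite) and plain Python lists; the helper functions are illustrative and not part of the text. It also shows that the plain transpose fails the adjoint identity once the weights are nontrivial.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def inverse_2x2(X):
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def bilinear(x, W, y):
    """Weighted inner product x^T W y for vectors x, y given as flat lists."""
    return sum(x[i] * W[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

# Hypothetical data: A maps R^2 to R^3; M and C are symmetric positive definite.
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
M = [[2.0, 1.0], [1.0, 3.0]]
C = [[2.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 4.0]]

A_star = matmul(inverse_2x2(M), matmul(transpose(A), C))   # M^{-1} A^T C, a 2 x 3 matrix

u = [1.0, -2.0]
v = [0.5, 4.0, -1.0]
Au = [sum(A[i][j] * u[j] for j in range(2)) for i in range(3)]
A_star_v = [sum(A_star[i][j] * v[j] for j in range(3)) for i in range(2)]
At_v = [sum(A[i][j] * v[i] for i in range(3)) for j in range(2)]

lhs = bilinear(Au, C, v)         # <<A u, v>> = (A u)^T C v
rhs = bilinear(u, M, A_star_v)   # <u, A* v>  = u^T M A* v
naive = bilinear(u, M, At_v)     # what the plain transpose would give instead
print(lhs, rhs, naive)
```

The first two printed numbers agree up to rounding, while the third differs, which illustrates that the adjoint depends on the chosen inner products.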


Differential  Operators 


For  applications  to  linear  differential  equations,  our  attention  is  focused  on  adjoints  of 
differential  operators  defined  on  infinite-dimensional  function  spaces.  Let  us  begin  with 
the  simplest  example. 

Example 9.3. Consider the derivative $v = D[u] = du/dx$, which defines a linear
operator $D \colon U \to V$ mapping a vector space $U$ of differentiable functions $u(x)$ to a vector
space containing their derivatives $v(x) = u'(x)$. We assume that the functions in question
are defined on a fixed bounded interval $a < x < b$.

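In order to compute its adjoint, we need to impose inner products on both the domain
space $U$ and the target space $V$. The simplest context is to adopt the standard $L^2$ inner
product on both:

$$\langle u, \tilde u \rangle = \int_a^b u(x)\,\tilde u(x)\,dx, \qquad \langle\langle v, \tilde v \rangle\rangle = \int_a^b v(x)\,\tilde v(x)\,dx. \qquad (9.7)$$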

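According to the defining equation (9.2), the adjoint operator $D^* \colon V \to U$ must satisfy the
inner product identity

$$\langle\langle D[u], v \rangle\rangle = \langle u, D^*[v] \rangle \qquad \text{for all} \qquad u \in U, \quad v \in V. \qquad (9.8)$$

First, we compute the left-hand side:

$$\langle\langle D[u], v \rangle\rangle = \left\langle\!\!\left\langle \frac{du}{dx}, v \right\rangle\!\!\right\rangle = \int_a^b \frac{du}{dx}\,v\,dx. \qquad (9.9)$$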

On the other hand, the right-hand side should equal

$$\langle u, D^*[v] \rangle = \int_a^b u\,D^*[v]\,dx. \qquad (9.10)$$

Now, in the latter integral, we see $u$ multiplying the result of applying the linear operator
$D^*$ to $v$. To identify this integrand with that in (9.9), we need to somehow remove the
derivative from $u$. The secret is integration by parts, which allows us to rewrite the first
integral in the form

$$\int_a^b \frac{du}{dx}\,v\,dx = \bigl[\,u(b)\,v(b) - u(a)\,v(a)\,\bigr] - \int_a^b u\,\frac{dv}{dx}\,dx. \qquad (9.11)$$

Ignoring the two boundary terms for a moment, we observe that the remaining integral
has the form of an inner product

$$-\int_a^b u\,\frac{dv}{dx}\,dx = \int_a^b u \left( -\frac{dv}{dx} \right) dx = \left\langle u, -\frac{dv}{dx} \right\rangle = \langle u, -D[v] \rangle. \qquad (9.12)$$

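Equating (9.9) and (9.12), we deduce that

$$\langle\langle D[u], v \rangle\rangle = \left\langle\!\!\left\langle \frac{du}{dx}, v \right\rangle\!\!\right\rangle = \left\langle u, -\frac{dv}{dx} \right\rangle = \langle u, -D[v] \rangle.$$

Thus, to satisfy the adjoint equation (9.8), we must have

$$\langle u, D^*[v] \rangle = \langle u, -D[v] \rangle \qquad \text{for all} \qquad u \in U, \quad v \in V,$$

and so the adjoint of the derivative operator is its negative:

$$D^* = -D. \qquad (9.13)$$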


However, the preceding argument is valid only if the boundary terms in the integration
by parts formula (9.11) vanish:

$$u(b)\,v(b) - u(a)\,v(a) = 0, \qquad (9.14)$$

which necessitates imposing suitable boundary conditions on the functions $u$ and $v$. For
example, imposing Dirichlet boundary conditions

$$u(a) = 0, \qquad u(b) = 0, \qquad (9.15)$$

will ensure that (9.14) holds, and therefore validates (9.13). In this case, the domain space
of $D \colon U \to V$ is the vector space

$$U = \{\, u(x) \mid u(a) = u(b) = 0 \,\},$$

while no boundary conditions need be imposed on the functions $v(x)$ in the target space
$V$. An evident alternative is to require that $v(a) = v(b) = 0$. In this case, the target space

$$V = \{\, v(x) \mid v(a) = v(b) = 0 \,\}$$

consists of all functions that vanish at the endpoints. Since the derivative $D \colon U \to V$ is
required to map a function $u(x) \in U$ to an allowable function $v(x) \in V$, the domain space
now consists of functions satisfying the Neumann boundary conditions:

$$U = \{\, u(x) \mid u'(a) = u'(b) = 0 \,\}.$$

These  are  evidently  not  the  only  two  possibilities.  Let  us  list  the  most  important  combina¬ 
tions  of  boundary  conditions  that  imply  the  vanishing  of  the  boundary  terms  (9.14),  and 
so  ensure  the  validity  of  the  adjoint  equation  (9.13): 


(a)  Dirichlet  boundary  conditions: 

(b)  Mixed  boundary  conditions: 

(c)  Neumann  boundary  conditions: 

(d)  Periodic  boundary  conditions: 


u(a)  =  u(b)  =  0. 

u(a)  =  u'(b)  =  0,  or  u\a )  =  u(b)  =  0. 

u'(a )  =  u'(b)  =  0. 

u(a)  =  u(b)  and  u'(a )  =  u'(b). 


In  all  cases,  the  boundary  conditions  impose  restrictions  on  the  domain  space  U  and,  in 
cases  (b-d)  when  we  are  identifying  v{x)  =  u'(x),  the  target  space  V  also. 
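
The role of the boundary terms can be checked numerically. The following is a minimal sketch, not part of the original development: the interval $[0,1]$, the test functions, and the helper `simpson` are illustrative assumptions. It compares the two sides of the adjoint identity for a function $u$ satisfying the Dirichlet conditions (9.15), and then for a function $w(x) = x$ that violates them, in which case the discrepancy should equal the boundary terms in (9.11).

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

a, b = 0.0, 1.0

# Illustrative choices (not from the text): u obeys the Dirichlet conditions (9.15),
# while v is left unconstrained at the endpoints.
u = lambda x: math.sin(math.pi * x)
du = lambda x: math.pi * math.cos(math.pi * x)
v = lambda x: math.exp(x)
dv = lambda x: math.exp(x)

lhs = simpson(lambda x: du(x) * v(x), a, b)      # <<D[u], v>>
rhs = simpson(lambda x: u(x) * (-dv(x)), a, b)   # <u, -D[v]>
boundary = u(b) * v(b) - u(a) * v(a)             # left-hand side of (9.14)

# w(x) = x violates the boundary conditions, so the boundary terms survive.
w = lambda x: x
dw = lambda x: 1.0
gap_w = (simpson(lambda x: dw(x) * v(x), a, b)
         - simpson(lambda x: w(x) * (-dv(x)), a, b))
boundary_w = w(b) * v(b) - w(a) * v(a)

print(f"Dirichlet u: lhs = {lhs:.10f}, rhs = {rhs:.10f}, boundary = {boundary:.1e}")
print(f"w(x) = x:    lhs - rhs = {gap_w:.10f}, boundary terms = {boundary_w:.10f}")
```

For the Dirichlet function the two inner products agree up to quadrature error, while for $w$ the difference between them reproduces the boundary contribution $w(b)\,v(b) - w(a)\,v(a)$.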
Remark: In the preceding discussion, we were purposely vague about the required
differentiability of the functions. In finite dimensions, every linear function $L\colon \mathbb{R}^n \to \mathbb{R}^m$
is given by matrix multiplication $L[u] = A\,u$, and hence is defined on all of the underlying
vector space $\mathbb{R}^n$. Linear operators on infinite-dimensional function spaces are typically not
defined on all possible functions. For example, the derivative operator $L = D\colon U \to V$
requires the function $u \in U$ to be differentiable. However, the target function $v = D[u] = u'$
is not necessarily as smooth, and so may belong to a different function space; for instance,
if $u \in C^1[a,b]$, then $v = u' \in C^0[a,b]$. On the other hand, the adjoint $D^* = -D$ is defined
only on differentiable functions $v$, so if $v \in C^1[a,b]$, then $u = -v' \in C^0[a,b]$. Keeping a
detailed account of the various smoothness requirements quickly becomes distracting.

To circumvent this technical annoyance, we will always deal with a fixed class of functions,
e.g., continuous functions or, more generally, $L^2$ functions, that are constrained only
by the imposed boundary conditions. When we write $L\colon U \to V$, we allow the possibility
that the linear operator $L$ may be defined only on a “dense” subspace of the domain
space $U$. For instance, we will write $D\colon U \to V$ with $U = V = C^0[a,b]$, even though
$D[u] = u' \in V$ only if $u$ belongs to the dense subspace $C^1[a,b] \subset U = C^0[a,b]$. Similarly,
$D^*\colon V \to U$ is also defined only on the dense subspace $C^1[a,b] \subset V = C^0[a,b]$. The term
dense refers to the fact that any continuous function in the full space $U = C^0[a,b]$ can
be arbitrarily closely approximated in norm by a continuously differentiable function in
the subspace $C^1[a,b]$. Or, to put it another way, given a continuous function $u \in C^0[a,b]$,
there exists a sequence of continuously differentiable functions $u_1, u_2, u_3, \ldots \in C^1[a,b]$ such
that $\| u_k - u \| \to 0$ as $k \to \infty$. A similar density result can be proved for $U = L^2[a,b]$; see
[37, 96, 98] for details.

Warning: In more advanced treatments, our notion of adjoint is usually called the
formal adjoint. A true adjoint requires more subtle technical hypotheses on the operator
and its domain, cf. [95].

Example 9.4. Let us recompute the adjoint of the derivative operator $D\colon U \to V$,
this time with respect to the weighted $L^2$ inner products

$$ \langle u , \tilde u \rangle = \int_a^b u(x)\, \tilde u(x)\, \rho(x)\, dx, \qquad \langle\!\langle v , \tilde v \rangle\!\rangle = \int_a^b v(x)\, \tilde v(x)\, \kappa(x)\, dx, \tag{9.16} $$

where $\rho(x) > 0$ and $\kappa(x) > 0$ are strictly positive functions that, physically, might represent
the density and stiffness of a nonuniform bar. Now we need to compare

$$ \langle\!\langle D[u] , v \rangle\!\rangle = \int_a^b \frac{du}{dx}\, v(x)\, \kappa(x)\, dx \qquad \text{with} \qquad \langle u , D^*[v] \rangle = \int_a^b u(x)\, D^*[v]\, \rho(x)\, dx. $$

Integrating the first expression by parts, we obtain

$$ \int_a^b \frac{du}{dx}\, v\, \kappa \, dx = \bigl[\, u(b)\, v(b)\, \kappa(b) - u(a)\, v(a)\, \kappa(a) \,\bigr] - \int_a^b u\, \frac{d(\kappa v)}{dx}\, dx = \int_a^b u \left( - \frac{1}{\rho}\, \frac{d(\kappa v)}{dx} \right) \rho \, dx, \tag{9.17} $$

provided that we select our boundary conditions so that

$$ u(b)\, v(b)\, \kappa(b) - u(a)\, v(a)\, \kappa(a) = 0. \tag{9.18} $$

As you can check, this follows from any of the listed boundary conditions: Dirichlet,
Neumann, or mixed, as well as periodic, provided $\kappa(a) = \kappa(b)$. We conclude that, in such
situations, the weighted adjoint of the derivative operator $D$ is the differential operator

$$ D^*[v(x)] = - \frac{1}{\rho(x)}\, \frac{d}{dx} \bigl[\, \kappa(x)\, v(x) \,\bigr] = - \frac{\kappa(x)}{\rho(x)}\, \frac{dv}{dx} - \frac{\kappa'(x)}{\rho(x)}\, v(x). \tag{9.19} $$

As with matrices, the adjoint of a differential operator depends crucially on the specification
of inner products.
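
To see the dependence on the inner products concretely, one can test formula (9.19) by quadrature. The sketch below is only an illustration under stated assumptions: the weights $\rho(x) = 1 + x$ and $\kappa(x) = e^x$, the interval $[0,1]$, the test functions, and the helper names are all chosen here for convenience and do not come from the text. It checks that the weighted adjoint satisfies the adjoint identity, whereas the unweighted adjoint $-D$ does not.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] using n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

a, b = 0.0, 1.0

# Assumed weights (strictly positive on [0, 1]).
rho = lambda x: 1.0 + x           # weight for the inner product on U
kappa = lambda x: math.exp(x)     # weight for the inner product on V
dkappa = lambda x: math.exp(x)

# Assumed test functions: u satisfies Dirichlet conditions, v is unconstrained.
u = lambda x: math.sin(math.pi * x)
du = lambda x: math.pi * math.cos(math.pi * x)
v = lambda x: x * x
dv = lambda x: 2.0 * x

def D_star(x):
    """Weighted adjoint (9.19): -(kappa/rho) v' - (kappa'/rho) v."""
    return -(kappa(x) * dv(x) + dkappa(x) * v(x)) / rho(x)

lhs = simpson(lambda x: du(x) * v(x) * kappa(x), a, b)      # <<D[u], v>>, weight kappa
rhs = simpson(lambda x: u(x) * D_star(x) * rho(x), a, b)    # <u, D*[v]>, weight rho
naive = simpson(lambda x: u(x) * (-dv(x)) * rho(x), a, b)   # <u, -D[v]>, weight rho

print(f"<<D[u], v>> = {lhs:.8f}")
print(f"<u, D*[v]>  = {rhs:.8f}   (weighted adjoint)")
print(f"<u, -D[v]>  = {naive:.8f}   (unweighted adjoint, does not match)")
```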


The following basic results are left as exercises for the reader. The first generalizes
the fact that transposing a transposed matrix reverts to the original.

Proposition 9.5. The adjoint of the adjoint is the original operator: $L = (L^*)^*$.

The second generalizes the fact that the transpose of the product of two matrices is
the product of the transposes, but in the reverse order.

Proposition 9.6. If $L\colon U \to V$ and $M\colon V \to W$ are linear operators on inner product
spaces, with $L^*\colon V \to U$ and $M^*\colon W \to V$ their respective adjoints, then the composite
linear operator $M \circ L\colon U \to W$ has adjoint $(M \circ L)^* = L^* \circ M^*\colon W \to U$.

Example 9.7. Let us compute the adjoint of the second derivative operator $D^2 = D \circ D$
with respect to the standard $L^2$ inner products on both the domain and target spaces.
According to Proposition 9.6 and equation (9.13), at least on a formal level,

$$ (D^2)^* = D^* \circ D^* = (-D) \circ (-D) = D^2, \tag{9.20} $$

and hence $D^2$ equals its own adjoint. However, the validity of (9.13) required that the
functions in the domain and target spaces of both $D$'s satisfy appropriate boundary conditions.
For example, the domain of the first $D\colon U \to V$ could be $U = \{ u(x) \mid u(a) = u(b) = 0 \}$,
while its target space $V$ is unconstrained; the second $D$ could then map $V$ to
$W = \{ w(x) \mid w(a) = w(b) = 0 \}$, which will thus also require that $u''(a) = u''(b) = 0$ in
order that $D^2 = D \circ D$ map $U$ to $W$. Another option would be to impose Neumann conditions
on the first $D$, with $U = \{ u'(a) = u'(b) = 0 \}$ and thus $V = \{ v(a) = v(b) = 0 \}$, while
$W$ remains unconstrained. Under either these or other suitably compatible constraints,
both adjoint identifications $D^* = -D$ are valid, thus justifying (9.20). Keep in mind that,
according to our earlier remark, the differentiation operators are, in fact, defined only on
the dense subspaces containing sufficiently smooth functions.


Higher-Dimensional  Operators 


The most natural multi-dimensional analogue of the derivative is the gradient operator,
which, on a two-dimensional space, is given by

$$ \nabla u = \operatorname{grad} u = \begin{pmatrix} \partial u / \partial x \\ \partial u / \partial y \end{pmatrix}. $$

The gradient $\nabla$ defines a linear operator that takes a scalar-valued function $u(x,y)$ to
the vector-valued function consisting of its two first-order partial derivatives. Thus, the
domain space $U$ consists of scalar-valued functions $u(x,y)$, or scalar fields, defined for
$(x,y) \in \Omega$, where the domain $\Omega \subset \mathbb{R}^2$ is assumed to be both bounded and connected,
and with a nice boundary $\partial\Omega$. (Similar considerations apply to three- and even higher-dimensional
problems.) The target space $V$ consists of vector-valued functions, or vector
fields, $\mathbf{v}(x,y) = ( v_1(x,y), v_2(x,y) )^T$ defined on $\Omega$. As in the preceding subsection, the gradient
operator $\nabla\colon U \to V$ is well defined only on the dense subspace $C^1(\Omega) \subset U$ consisting
of continuously differentiable scalar fields.

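In accordance with the general Definition 9.1, the adjoint of the gradient must go in
the reverse direction,

$$ \nabla^*\colon V \to U, $$

mapping a vector field $\mathbf{v}(x,y)$ to a scalar field $w(x,y) = \nabla^* \mathbf{v}$. The defining equation (9.2)
for the adjoint, namely

$$ \langle\!\langle \nabla u , \mathbf{v} \rangle\!\rangle = \langle u , \nabla^* \mathbf{v} \rangle, \tag{9.21} $$

relies on the choice of inner products on the two vector spaces. Let us start with the $L^2$
inner product between scalar fields:

$$ \langle u , \tilde u \rangle = \iint_\Omega u(x,y)\, \tilde u(x,y)\, dx\, dy. \tag{9.22} $$

Similarly, the $L^2$ inner product between vector fields defined on $\Omega$ is obtained by integrating
their usual dot product:

$$ \langle\!\langle \mathbf{v} , \tilde{\mathbf{v}} \rangle\!\rangle = \iint_\Omega \mathbf{v}(x,y) \cdot \tilde{\mathbf{v}}(x,y)\, dx\, dy = \iint_\Omega \bigl[\, v_1(x,y)\, \tilde v_1(x,y) + v_2(x,y)\, \tilde v_2(x,y) \,\bigr]\, dx\, dy. \tag{9.23} $$

The adjoint identity (9.21) is supposed to hold for all appropriate scalar fields $u$ and vector
fields $\mathbf{v}$. For the $L^2$ inner products (9.22, 23), the two sides of the identity read

$$ \langle\!\langle \nabla u , \mathbf{v} \rangle\!\rangle = \iint_\Omega \nabla u \cdot \mathbf{v}\, dx\, dy = \iint_\Omega \left( v_1\, \frac{\partial u}{\partial x} + v_2\, \frac{\partial u}{\partial y} \right) dx\, dy, \qquad \langle u , \nabla^* \mathbf{v} \rangle = \iint_\Omega u\, \nabla^* \mathbf{v}\, dx\, dy. $$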

Thus, to compare these two double integrals, we must somehow remove the derivatives
from the scalar field $u$. As in the one-dimensional computation (9.8), the mechanism is an
integration by parts formula for double integrals:

$$ \iint_\Omega \nabla u \cdot \mathbf{v}\, dx\, dy = \oint_{\partial\Omega} u\, (\mathbf{v} \cdot \mathbf{n})\, ds - \iint_\Omega u\, (\nabla \cdot \mathbf{v})\, dx\, dy, \tag{9.24} $$

which was already noted in (6.83). The left-hand side is just $\langle\!\langle \nabla u , \mathbf{v} \rangle\!\rangle$. If the boundary
line integral vanishes,

$$ \oint_{\partial\Omega} u\, (\mathbf{v} \cdot \mathbf{n})\, ds = 0, \tag{9.25} $$

then the right-hand side of formula (9.24) reduces to

$$ - \iint_\Omega u\, (\nabla \cdot \mathbf{v})\, dx\, dy = - \langle u , \nabla \cdot \mathbf{v} \rangle = \langle u , - \nabla \cdot \mathbf{v} \rangle. $$

Therefore, subject to the boundary constraint (9.25), we deduce the $L^2$ inner product
identity

$$ \langle\!\langle \nabla u , \mathbf{v} \rangle\!\rangle = \langle u , - \nabla \cdot \mathbf{v} \rangle, \tag{9.26} $$

which implies that the $L^2$ adjoint of the gradient operator is minus the divergence operator:

$$ \nabla^* \mathbf{v} = - \nabla \cdot \mathbf{v}. \tag{9.27} $$

The vanishing of the boundary integral (9.25) will be ensured by the imposition of
suitable homogeneous boundary conditions on the scalar field $u$ and/or the vector field $\mathbf{v}$.
Clearly the line integral will vanish if either $u = 0$ or $\mathbf{v} \cdot \mathbf{n} = 0$ at each point on the boundary.
These possibilities lead immediately to the three principal types of (homogeneous)
boundary conditions. The first are the Dirichlet boundary conditions, which require

$$ u = 0 \quad \text{on} \quad \partial\Omega. \tag{9.28} $$

Alternatively, we can set

$$ \mathbf{v} \cdot \mathbf{n} = 0 \quad \text{on} \quad \partial\Omega, \tag{9.29} $$

which requires that $\mathbf{v}$ be everywhere tangent to the boundary. Since $\nabla$ must map the
scalar field $u \in U$ to an admissible vector field $\mathbf{v} = \nabla u \in V$, the boundary condition (9.29)
requires that $u$ satisfy the homogeneous Neumann boundary conditions

$$ \frac{\partial u}{\partial n} = \nabla u \cdot \mathbf{n} = 0 \quad \text{on} \quad \partial\Omega. \tag{9.30} $$

One can evidently also mix the boundary conditions, imposing Dirichlet conditions on part
of the boundary and Neumann conditions on the complementary part:

$$ u = 0 \quad \text{on} \quad D \subset \partial\Omega, \qquad \mathbf{v} \cdot \mathbf{n} = \frac{\partial u}{\partial n} = 0 \quad \text{on} \quad N = \partial\Omega \setminus D, \tag{9.31} $$

with neither $D$ nor $N$ empty.
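
The identity (9.26) can also be tested numerically. The sketch below makes several assumptions that are not in the text: the domain is taken to be the unit square, $u = \sin \pi x \, \sin \pi y$ is chosen so that the Dirichlet condition (9.28) holds, the vector field $\mathbf{v} = (x^2 y, \ x + y^2)$ is arbitrary, and `integrate_square` is a hypothetical helper implementing a tensor-product Simpson rule.

```python
import math

def integrate_square(f, n=100):
    """Tensor-product Simpson rule for the double integral of f over [0,1] x [0,1]."""
    h = 1.0 / n
    # Simpson weights 1, 4, 2, 4, ..., 4, 1 (the factor h/3 is applied at the end).
    w = [1 if i in (0, n) else (4 if i % 2 else 2) for i in range(n + 1)]
    total = 0.0
    for i in range(n + 1):
        for j in range(n + 1):
            total += w[i] * w[j] * f(i * h, j * h)
    return total * (h / 3) ** 2

pi = math.pi

# Scalar field vanishing on the boundary of the unit square, and its gradient.
u = lambda x, y: math.sin(pi * x) * math.sin(pi * y)
ux = lambda x, y: pi * math.cos(pi * x) * math.sin(pi * y)
uy = lambda x, y: pi * math.sin(pi * x) * math.cos(pi * y)

# Arbitrary smooth vector field v = (v1, v2) and its divergence.
v1 = lambda x, y: x * x * y
v2 = lambda x, y: x + y * y
div_v = lambda x, y: 2 * x * y + 2 * y

lhs = integrate_square(lambda x, y: ux(x, y) * v1(x, y) + uy(x, y) * v2(x, y))  # <<grad u, v>>
rhs = integrate_square(lambda x, y: -u(x, y) * div_v(x, y))                     # <u, -div v>

print(f"<<grad u, v>> = {lhs:.8f}")
print(f"<u, -div v>   = {rhs:.8f}")
```

For these particular choices both integrals can be evaluated by hand and equal $-6/\pi^2$, which gives an independent check on the quadrature.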

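More generally, when modeling deflections of nonuniform membranes, heat flow through
heterogeneous media, and similar physical equilibria, we replace the $L^2$ inner product between
scalar and vector fields (9.23) by suitably weighted versions†

$$ \langle u , \tilde u \rangle = \iint_\Omega u(x,y)\, \tilde u(x,y)\, \rho(x,y)\, dx\, dy, \qquad \langle\!\langle \mathbf{v} , \tilde{\mathbf{v}} \rangle\!\rangle = \iint_\Omega \bigl[\, v_1(x,y)\, \tilde v_1(x,y)\, \kappa_1(x,y) + v_2(x,y)\, \tilde v_2(x,y)\, \kappa_2(x,y) \,\bigr]\, dx\, dy, \tag{9.32} $$

in which $\rho(x,y), \kappa_1(x,y), \kappa_2(x,y) > 0$ are strictly positive functions for $(x,y) \in \Omega$. In applications,
$\rho$ represents a density, while $\kappa_1, \kappa_2$ represent stiffnesses or thermal conductivities.

To compute the weighted adjoint of the gradient operator, we apply a similar integration
by parts argument based on (6.83):

$$ \begin{aligned} \langle\!\langle \nabla u , \mathbf{v} \rangle\!\rangle &= \iint_\Omega \left( \kappa_1 v_1\, \frac{\partial u}{\partial x} + \kappa_2 v_2\, \frac{\partial u}{\partial y} \right) dx\, dy \\ &= \oint_{\partial\Omega} u\, \bigl( - \kappa_2 v_2\, dx + \kappa_1 v_1\, dy \bigr) - \iint_\Omega u \left[ \frac{\partial (\kappa_1 v_1)}{\partial x} + \frac{\partial (\kappa_2 v_2)}{\partial y} \right] dx\, dy \\ &= \iint_\Omega u \left[ - \frac{1}{\rho} \left( \frac{\partial (\kappa_1 v_1)}{\partial x} + \frac{\partial (\kappa_2 v_2)}{\partial y} \right) \right] \rho\, dx\, dy, \end{aligned} \tag{9.33} $$

provided the boundary integral vanishes. Equating the left-hand side to

$$ \langle u , \nabla^* \mathbf{v} \rangle = \iint_\Omega u\, (\nabla^* \mathbf{v})\, \rho\, dx\, dy, $$

we deduce that the adjoint of the gradient operator with respect to the weighted inner
products (9.32) is minus the “weighted divergence operator”:

$$ \nabla^* \mathbf{v} = - \frac{1}{\rho} \left( \frac{\partial (\kappa_1 v_1)}{\partial x} + \frac{\partial (\kappa_2 v_2)}{\partial y} \right) = - \frac{\kappa_1}{\rho}\, \frac{\partial v_1}{\partial x} - \frac{\kappa_2}{\rho}\, \frac{\partial v_2}{\partial y} - \frac{1}{\rho}\, \frac{\partial \kappa_1}{\partial x}\, v_1 - \frac{1}{\rho}\, \frac{\partial \kappa_2}{\partial y}\, v_2. \tag{9.34} $$

The vanishing of the boundary integral,

$$ \oint_{\partial\Omega} u\, \bigl( - \kappa_2 v_2\, dx + \kappa_1 v_1\, dy \bigr) = \oint_{\partial\Omega} u\, \widehat{\mathbf{v}} \cdot \mathbf{n}\, ds, \qquad \text{where} \qquad \widehat{\mathbf{v}} = \begin{pmatrix} \kappa_1 v_1 \\ \kappa_2 v_2 \end{pmatrix}, $$

is ensured if either $u = 0$ or $\widehat{\mathbf{v}} \cdot \mathbf{n} = 0$ on $\partial\Omega$. The former is the usual homogeneous Dirichlet
condition, but the latter is a “weighted” version of the homogeneous Neumann boundary
condition, requiring that $\widehat{\nabla} u \cdot \mathbf{n} = 0$ on the boundary, where $\widehat{\nabla} u = ( \kappa_1 u_x , \kappa_2 u_y )^T$ represents
a “weighted normal flux vector”.

† Exercise 9.2.14 treats an even more general pair of inner products.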


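Example 9.8. Let us compute the adjoint of the second-order Laplacian operator
$\Delta = \partial^2 / \partial x^2 + \partial^2 / \partial y^2$ with respect to the $L^2$ inner products on both its domain and target
spaces. The computation is a simple consequence of the double integral identity (6.88),
which we rewrite as

$$ \langle \Delta u , v \rangle = \iint_\Omega v\, \Delta u\, dx\, dy = \oint_{\partial\Omega} \left( v\, \frac{\partial u}{\partial n} - u\, \frac{\partial v}{\partial n} \right) ds + \iint_\Omega u\, \Delta v\, dx\, dy = \oint_{\partial\Omega} \left( v\, \frac{\partial u}{\partial n} - u\, \frac{\partial v}{\partial n} \right) ds + \langle u , \Delta v \rangle. $$

Thus, provided the boundary integral vanishes, we can conclude that the Laplacian equals
its own adjoint: $\Delta^* = \Delta$. This is assured when $u\, \partial v / \partial n = v\, \partial u / \partial n$ at each point in $\partial\Omega$.
For example, the adjoint computation is valid if either $u = v = 0$ or $\partial u / \partial n = \partial v / \partial n = 0$
at every point of the boundary of the domain. Keep in mind that if we require $v = 0$ on
some or all of $\partial\Omega$, then this imposes the condition $\Delta u = 0$ there in order that $\Delta$ map $u$ to
an admissible $v$; similar considerations apply when $\partial v / \partial n = 0$.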


Exercises 


9.1.1.  Choose  one  from  the  following  list  of  inner  products  on  R2.  Then  find  the  adjoint  of 
1  2  \ 

A  =  (  1  ^  J  when  your  inner  product  is  used  on  both  its  domain  and  target  space. 


(a) The Euclidean dot product; (b) the weighted inner product $\langle v , w \rangle = 2\, v_1 w_1 + 3\, v_2 w_2$;
(c) the inner product $\langle v , w \rangle = v^T C\, w$ defined by the symmetric positive definite matrix


9.1.2. From the list in Exercise 9.1.1, choose a different inner product on the domain and the
target space, and then determine the adjoint of the matrix $A$.

9.1.3. Choose one from the following list of inner products on $\mathbb{R}^3$ for both the domain and target
space, and find the adjoint of $A = \begin{pmatrix} 1 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & -1 & 2 \end{pmatrix}$. (a) The Euclidean dot product on $\mathbb{R}^3$;
(b) the weighted inner product $\langle v , w \rangle = v_1 w_1 + 2\, v_2 w_2 + \cdots$; (c) the inner product
$\langle v , w \rangle = v^T C\, w$ defined by the symmetric positive definite matrix
$C = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{pmatrix}$.

9.1.4. From the list in Exercise 9.1.3, choose different inner products on the domain and target
space, and then compute the adjoint of the matrix $A$.

9.1.5. Choose an inner product on $\mathbb{R}^2$ from the list in Exercise 9.1.1 and an inner product on
$\mathbb{R}^3$ from the list in Exercise 9.1.3, and then compute the adjoint of $A = \begin{pmatrix} 1 & 3 \\ 0 & 2 \\ -1 & 1 \end{pmatrix}$.


^ 9.1.6. (a) Let $C$ be an $m \times n$ matrix. Suppose $u^T C\, v = 0$ for all $u \in \mathbb{R}^m$ and $v \in \mathbb{R}^n$. Prove
that $C = O$ must be the zero matrix. (b) Let $A, B$ be $m \times n$ matrices such that
$u^T A\, v = u^T B\, v$ for all $u \in \mathbb{R}^m$ and $v \in \mathbb{R}^n$. Prove that $A = B$. (c) Find an $n \times n$ matrix
$C \neq O$ such that $u^T C\, u = 0$ for all $u \in \mathbb{R}^n$.

9.1.7. Let $U = C^0[0,1]$. Find the adjoint $I^*$ of the identity operator $I\colon U \to U$ under the
weighted inner products (9.16).

9.1.8. Compute the adjoint of the derivative operator $v = D[u] = u'$ under the weighted inner
products $\langle u , \tilde u \rangle = \int_0^1 e^x\, u(x)\, \tilde u(x)\, dx$, $\ \langle\!\langle v , \tilde v \rangle\!\rangle = \int_0^1 (1 + x)\, v(x)\, \tilde v(x)\, dx$. Clearly state any
boundary conditions that you are imposing.

9.1.9. Let $L[u] = x\, u'(x) + u(x)$ and $0 < a < x < b$. When subject to homogeneous Dirichlet
boundary conditions $u(a) = u(b) = 0$, determine the adjoint $L^*[v]$ with respect to
(a) the $L^2$ inner products (9.7); (b) the weighted inner products (9.16).

9.1.10. Consider the linear operator $L[u] = \begin{pmatrix} u \\ u' \end{pmatrix}$ that maps $u(x) \in C^1$ to the vector-valued
function whose components consist of the function and its first derivative. Imposing the
boundary conditions $u(0) = u(1)$, compute the adjoint $L^*$ with respect to the $L^2$ inner
products on both the domain and target spaces.

9.1.11. True or false: The adjoint of the divergence operator $\nabla \cdot \mathbf{v}$ with respect to the $L^2$ inner
products (9.22, 23) is minus the gradient operator: $(\nabla \cdot)^* u = - \nabla u$. If true, what boundary
conditions do you need to assume? If false, what is the adjoint?

9.1.12. Find the adjoint of the two-dimensional curl operator $\nabla \times \mathbf{v}$, as defined in (6.73),
with respect to the $L^2$ inner products (9.22, 23). Carefully state any required boundary
conditions.

§ 9.1.13. Prove that (a) the adjoint of a linear operator is also a linear operator;
(b) the adjoint is unique.

0 9.1.14. Let $L, M\colon U \to V$ be linear operators on the same inner product spaces. Prove that
(a) $(L + M)^* = L^* + M^*$, (b) $(c L)^* = c\, L^*$ for $c \in \mathbb{R}$.

9.1.15. Prove Proposition 9.5.

^ 9.1.16. Prove Proposition 9.6.

9.1.17. True or false: If $L\colon U \to U$ is invertible, then $(L^{-1})^* = (L^*)^{-1}$.

The Fredholm Alternative


Given a linear operator $L\colon U \to V$ between inner product spaces $U, V$, a fundamental
problem is to solve the associated inhomogeneous linear system

$$ L[u] = f \tag{9.35} $$

for various forcing functions $f \in V$. In finite dimensions, this reduces to a linear algebraic
system, $A\, u = f$, defined by a coefficient matrix $A$. For the linear ordinary and partial
differential operators of interest to us, (9.35) represents a linear boundary value problem.
In general, an inhomogeneous linear system will not be solvable unless its right-hand side
satisfies certain constraints, ensuring that $f$ belongs to the range of $L$. These conditions
can be readily characterized using the adjoint operator via the so-called Fredholm Alternative,
named after the early-twentieth-century Swedish mathematician Ivar Fredholm.
Fredholm’s primary interest was in solving linear integral equations, but his solvability criterion
was then recognized to be a completely general property of linear systems, including
linear algebraic systems, linear differential equations, linear boundary value problems, and
so on.

Recall that the kernel of a linear operator $L$ is the set of solutions to the homogeneous
linear system $L[u] = 0$.

Definition 9.9. The cokernel of a linear operator $L\colon U \to V$ between inner product
spaces is defined as the kernel of its adjoint:

$$ \operatorname{coker} L = \ker L^* = \{\, v \in V \mid L^*[v] = 0 \,\}. \tag{9.36} $$

We can now state and prove the Fredholm Alternative.

Theorem 9.10. If the linear system $L[u] = f$ has a solution, then the right-hand
side must be orthogonal to the cokernel of $L$, i.e.,

$$ \langle\!\langle v , f \rangle\!\rangle = 0 \qquad \text{for all} \quad v \in \operatorname{coker} L. \tag{9.37} $$

Proof: If $L[u] = f$, then, given $v \in \operatorname{coker} L$, the adjoint equation (9.2) implies

$$ \langle\!\langle v , f \rangle\!\rangle = \langle\!\langle v , L[u] \rangle\!\rangle = \langle L^*[v] , u \rangle = 0, $$

since $L^*[v] = 0$ by the definition of the cokernel. Q.E.D.


Remark: In practice, one needs to check the orthogonality constraints (9.37) only
when $v$ runs through a basis of the cokernel. In particular, if the only solution to the
homogeneous adjoint system $L^*[v] = 0$ is the trivial solution $v = 0$, then there are no
constraints, and we expect that the inhomogeneous linear system (9.35) can be solved
for any “reasonable” forcing function $f$. In finite dimensions, this is certainly the case,
[89]. For boundary value problems defined by linear differential operators, one needs to
determine what “reasonable” means, and then prove an appropriate existence theorem.
Although valid for all of the boundary value problems presented here, when subject to
continuous or even piecewise continuous forcing functions $f$, rigorous proofs of the existence
of solutions for partial differential equations involve the advanced mathematical machinery
of functional analysis (see, e.g., [38, 44, 61, 99]) and lie beyond the scope of this
introductory text.


Example 9.11. Consider the linear algebraic system

$$ u_1 - u_3 = f_1, \qquad u_2 - 2\, u_3 = f_2, \qquad u_1 - 2\, u_2 + 3\, u_3 = f_3. \tag{9.38} $$

Using Gaussian Elimination (or by inspection), one easily sees that (9.38) admits a solution
if and only if the compatibility condition

$$ - f_1 + 2 f_2 + f_3 = 0 \tag{9.39} $$

holds. Moreover, when this occurs, a solution exists but is not unique. To connect this to
the Fredholm Alternative, we write the system in matrix form $L[u] = f$, where $L[u] = A\, u$
represents multiplication by the coefficient matrix

$$ A = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & -2 \\ 1 & -2 & 3 \end{pmatrix}. $$

Using the dot product on $\mathbb{R}^3$, the adjoint linear function $L^*[v] = A^T v$ is represented by
the transposed matrix

$$ A^T = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & -2 \\ -1 & -2 & 3 \end{pmatrix}. $$

Therefore, the cokernel is found by solving the homogeneous adjoint linear system $A^T v = 0$,
i.e.,

$$ v_1 + v_3 = 0, \qquad v_2 - 2\, v_3 = 0, \qquad - v_1 - 2\, v_2 + 3\, v_3 = 0, $$

whose solutions consist of all scalar multiples of $v = ( -1, 2, 1 )^T$. We recognize the compatibility
condition (9.39) as requiring that the right-hand side be orthogonal (under the
dot product) to the cokernel basis vector,

$$ v \cdot f = - f_1 + 2 f_2 + f_3 = 0, $$

in accordance with the Fredholm Alternative constraint (9.37).

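
The computations in this example are easily checked by machine. The following sketch uses exact integer arithmetic; the helper names (`transpose`, `matvec`, `dot`, `solve_if_compatible`) and the two sample right-hand sides are illustrative assumptions, not notation from the text. It verifies that $v = (-1, 2, 1)^T$ lies in the cokernel, and that a right-hand side admits a (nonunique) solution precisely when it is orthogonal to $v$.

```python
# Coefficient matrix of the system (9.38).
A = [[1, 0, -1],
     [0, 1, -2],
     [1, -2, 3]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

# Cokernel basis vector quoted in the example; A^T v should be the zero vector.
v = [-1, 2, 1]
adjoint_residual = matvec(transpose(A), v)

def solve_if_compatible(f, u3=0):
    """Return one solution of A u = f, or None if f is not orthogonal to v.

    The first two equations give u1 = f1 + u3 and u2 = f2 + 2 u3; the free
    parameter u3 reflects the nonuniqueness of the solution.
    """
    if dot(v, f) != 0:
        return None
    return [f[0] + u3, f[1] + 2 * u3, u3]

f_good = [1, 2, -3]   # -1 + 4 - 3 = 0, so (9.39) holds
f_bad = [1, 1, 1]     # -1 + 2 + 1 = 2, so (9.39) fails

print(adjoint_residual)
print(solve_if_compatible(f_good), solve_if_compatible(f_good, u3=5))
print(solve_if_compatible(f_bad))
```

Different values of the free parameter `u3` produce different solutions for the same compatible right-hand side, illustrating the lack of uniqueness.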

Example 9.12. Let us solve the boundary value problem

$$ u'' = f(x), \qquad u'(0) = 0, \qquad u'(\ell) = 0, \tag{9.40} $$

modeling the displacement, under an external force, of a uniform elastic bar of length $\ell$
both of whose ends are free. Solving the differential equation by direct integration, we find
that

$$ u(x) = a\, x + b + \int_0^x \left( \int_0^y f(z)\, dz \right) dy, $$

where the constants $a, b$ are to be determined by the boundary conditions. Since

$$ u'(x) = a + \int_0^x f(z)\, dz, $$

the boundary condition $u'(0) = 0$ implies that $a = 0$. The second boundary condition
requires

$$ u'(\ell) = \int_0^\ell f(x)\, dx = 0. \tag{9.41} $$

If this fails, then the boundary value problem has no solution. On the other hand, if
the forcing function $f(x)$ satisfies the constraint (9.41), then the resulting solution of the
boundary value problem has the form

$$ u(x) = b + \int_0^x \left( \int_0^y f(z)\, dz \right) dy, \tag{9.42} $$

where the constant $b$ is arbitrary. Thus, when it exists, the solution to the boundary value
problem is not unique. The constant $b$ solves the corresponding homogeneous problem,
and represents a rigid translation of the entire bar by a distance $b$.

The solvability constraint (9.41) follows from the Fredholm Alternative. Indeed, according
to Example 9.7, under the $L^2$ inner products and the given boundary conditions,
$(D^2)^* = D^2$, and hence the adjoint system is the unforced homogeneous boundary value
problem

$$ v'' = 0, \qquad v'(0) = 0, \qquad v'(\ell) = 0, $$

with solution $v(x) = c$ for any constant $c$. Thus, the cokernel consists of all scalar multiples
of the constant function $v(x) = 1$. The Fredholm Alternative requires that the forcing
function in the original boundary value problem be orthogonal to the cokernel functions,
and so

$$ \langle 1 , f \rangle = \int_0^\ell f(x)\, dx = 0, $$

which is precisely the condition (9.41) required for existence of a (nonunique) equilibrium
solution.
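
As a concrete illustration, consider two sample forcing functions; both are our own choices for the purpose of this sketch and do not appear in the example itself. The forcing $f(x) = \cos(\pi x / \ell)$ has zero integral over $[0, \ell]$ and so should admit a one-parameter family of solutions, whereas the constant forcing $f(x) = 1$ should admit none:

```latex
\begin{align*}
% Compatible forcing: f(x) = cos(pi x / l)
&\int_0^\ell \cos\frac{\pi x}{\ell}\,dx = \frac{\ell}{\pi}\,\sin \pi = 0,
&& u(x) = b + \int_0^x \frac{\ell}{\pi}\,\sin\frac{\pi y}{\ell}\,dy
        = b + \frac{\ell^2}{\pi^2}\Bigl( 1 - \cos\frac{\pi x}{\ell} \Bigr), \\
&u'(x) = \frac{\ell}{\pi}\,\sin\frac{\pi x}{\ell},
&& u'(0) = u'(\ell) = 0, \qquad u''(x) = \cos\frac{\pi x}{\ell}. \\
% Incompatible forcing: f(x) = 1
&\int_0^\ell 1\,dx = \ell \neq 0,
&& u'(x) = a + x, \quad u'(0) = 0 \ \Longrightarrow\ a = 0, \quad u'(\ell) = \ell \neq 0.
\end{align*}
```

In the first case the constant $b$ remains free, in agreement with the nonuniqueness noted above; in the second, the two boundary conditions cannot be satisfied simultaneously.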

Example 9.13. Consider the homogeneous Neumann boundary value problem for
the Poisson equation on a bounded domain $\Omega \subset \mathbb{R}^2$, namely,

$$ - \Delta u = f \quad \text{in} \quad \Omega, \qquad \frac{\partial u}{\partial n} = 0 \quad \text{on} \quad \partial\Omega. \tag{9.43} $$

According to Example 9.8, the Laplacian is self-adjoint under the $L^2$ inner product and
the prescribed boundary conditions: $\Delta^* = \Delta$. Thus, the homogeneous adjoint system is
merely

$$ - \Delta v = 0 \quad \text{in} \quad \Omega, \qquad \frac{\partial v}{\partial n} = 0 \quad \text{on} \quad \partial\Omega. $$

Theorem 6.15 tells us that the only solutions to the adjoint problem are the constant
functions, $v(x,y) = c$. Thus, a basis for the cokernel consists of the function $v(x,y) = 1$,
and so the Fredholm Alternative requires that the forcing function in (9.43) satisfy

$$ \langle 1 , f \rangle = \iint_\Omega f(x,y)\, dx\, dy = 0, \tag{9.44} $$

reproducing our earlier constraint (6.90) for the homogeneous Neumann case.


Exercises 

9.1.18.  Use  the  Fredholm  Alternative  to  determine  whether  the  following  linear  systems  are 
compatible.  When  compatible,  write  down  the  general  solution. 


.  .  2x-4 y  =  -2,

^  -x  +  2y  =  3,  (b) 

2x  +  3  y 

6x  —  3y  -\-  9 z  =  6,  3x  -\-7y 

2x-y  +  3z  =  2,  x  +  Ay 

-X  +  y 

2x1  —  3.x2  —  .x3  =  —1, 

2x1  +  3x2  —  x4  =  —1, 

(d)  Sx1  —  x2  =  1, 

(e)  3xx  +  2x3  —  x4  =  0, 

4xx  x<2  T  x3  =  2, 

X1  ~  x2  T  xs  ~  1- 

9.1.19. Use the Fredholm Alternative to find the compatibility conditions for the following
systems of linear equations.

(a) $2x + y = a$, $\ x + 4y = b$, $\ -3x + 2y = c$;

(b)  x  +  2y  +  3z  =  a,  —  x -\- by  —  2z  =  b,  2x  —  3y-\-bz  =  c; 

(c) $x_1 + 2x_2 + 3x_3 = b_1$, $\ x_2 + 2x_3 = b_2$, $\ 3x_1 + 5x_2 + 7x_3 = b_3$, $\ -2x_1 + x_2 + 4x_3 = b_4$;

(d) $x - 3y + 2z + w = a$, $\ 4x - 2y + 2z + 3w = b$, $\ 5x - 5y + 4z + 4w = c$, $\ 2x + 4y - 2z + w = d$.

9.1.20.  Suppose  A  is  a  symmetric  matrix.  Show  that  the  linear  system  Ax  =  b  has  a  solution 
if  and  only  if  b  is  orthogonal  to  ker  A. 

9.1.21.  Use  the  Fredholm  Alternative  to  determine  whether  or  not  there  exists  a  solution  to  the 

following  boundary  value  problem:  xu"  +  u  =  1  —  u  ( 1)  =  u  (2)  =  0.  If  so,  write  down 
all  solutions. 

9.1.22. Analyze the periodic boundary value problem

$$ - u'' = f(x), \qquad u(0) = u(2\pi), \qquad u'(0) = u'(2\pi), $$

along the same lines as in Example 9.12. Characterize the forcing functions for which the
problem has a solution. Explain why the constraints, if any, are in accordance with the
Fredholm Alternative. Write down a forcing function $f(x)$ that satisfies all your constraints,
and then find all corresponding solutions.

9.1.23.  Answer  Exercise  9.1.22  for  the  boundary  value  problems: 

(a)  u""  =  /(#),  u"  { 0)  =  u'"( 0)  =  0,  u" ( 1)  =  u'"(  1)  =  0; 

(b)  u""  =  f(x),  u" { 0)  =  u" (0)  =  0,  u(  1)  =  ^(l)  =  0. 

C 9.1.24. Let $\lambda$ be a real parameter. (a) For which values of $\lambda$ does the boundary value problem
$u'' + \lambda u = h(x)$, $u(0) = 0$, $u(1) = 0$, have a unique solution? (b) Construct the Green’s
function for all such $\lambda$. (c) In the nonunique cases, use the Fredholm Alternative to find
conditions on the forcing function $h(x)$ that are required for the existence of a solution.

9.1.25. Let $\Omega \subset \mathbb{R}^2$ be a bounded, connected domain. Using the $L^2$ inner products (9.22, 23)
on scalar and vector fields, write out the Fredholm Alternative constraints for the solvability
of the boundary value problem $\nabla \cdot \mathbf{v} = f$ in $\Omega$, subject to the homogeneous boundary
conditions $\mathbf{v} \cdot \mathbf{n} = 0$ on $\partial\Omega$.

9.1.26. Let $\Omega \subset \mathbb{R}^2$ be a bounded simply connected domain. Using the $L^2$ inner products
(9.22, 23) on scalar and vector fields, write out the Fredholm Alternative
constraints for the solvability of the boundary value problem $\nabla u = \mathbf{f}$ in $\Omega$, subject to
the homogeneous boundary conditions $u = 0$ on $\partial\Omega$.


9.2  Self-Adjoint  and  Positive  Definite  Linear  Functions 

In  finite-dimensional  linear  algebra,  there  are  two  particularly  important  classes  of 
matrices:  symmetric,  equal  to  their  own  transpose,  and  positive  definite,  as  prescribed 
by  Definition  B.12.  The  goal  of  this  section  is  to  adapt  both  concepts  to  more  general 
linear  operators,  paying  particular  attention  to  the  case  of  linear  differential  operators. 
The  resulting  classes  of  self-adjoint  and  positive  (semi-) definite  differential  operators  are 
ubiquitous  in  applications  of  ordinary  and  partial  differential  equations. 
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Self- Adjointness 

Throughout  this  section,  U  will  be  a  fixed  inner  product  space.  We  have  already  seen  that 
the  transpose  of  a  matrix  is  a  very  special  case  of  the  adjoint  operation.  Thus,  the  natural 
analogue  of  a  symmetric  matrix  is  a  linear  operator  that  equals  its  own  adjoint. 

Definition  9.14.  A  linear  operator  S:U  U  is  called  self-adjoint  if  S*  =  S. 


Thus,  according  to  (9.2),  S  is  self-adjoint  if  and  only  if 


(  S[u] ,  u )  =  ( u  ,  S[u 


for  all 


u.u  £  U. 


(9.45) 


Example 9.15. In the finite-dimensional case, a linear function $S\colon \mathbb{R}^n \to \mathbb{R}^n$ is
realized by matrix multiplication: $S[\mathbf{u}] = K \mathbf{u}$, where $K$ is a square matrix of size $n \times n$.
If we use the ordinary dot product on $\mathbb{R}^n$, then, according to Example 9.2, the adjoint
function $S^*\colon \mathbb{R}^n \to \mathbb{R}^n$ is given by multiplication by the transposed matrix: $S^*[\mathbf{u}] = K^T \mathbf{u}$.
Thus, a linear function is self-adjoint with respect to the dot product if and only if it is
represented by a symmetric matrix: $K^T = K$.

On the other hand, if we adopt the weighted inner product $\langle \mathbf{u}, \tilde{\mathbf{u}} \rangle = \mathbf{u}^T C\, \tilde{\mathbf{u}}$ provided
by the symmetric positive definite matrix $C > 0$, then, according to (9.6), the adjoint
function $S^*$ has matrix representative $C^{-1} K^T C$, and hence $S$ is self-adjoint under the
weighted inner product if and only if the matrix $K$ satisfies $K = C^{-1} K^T C$.
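
The weighted criterion is easy to test numerically. The following is a minimal sketch, assuming NumPy is available; the matrices `C` and `J` and the helper names `adjoint_matrix` and `is_self_adjoint` are illustrative choices and do not come from the text. It builds $K = C^{-1} J$ from a symmetric matrix $J$, so that $K$ is self-adjoint for the weighted inner product even though $K$ itself is not symmetric.

```python
import numpy as np

def adjoint_matrix(K, C):
    """Matrix of S* for S[u] = K u under <u, v> = u^T C v, namely C^{-1} K^T C."""
    return np.linalg.solve(C, K.T @ C)

def is_self_adjoint(K, C, tol=1e-10):
    """True when K equals its adjoint with respect to the inner product defined by C."""
    return np.allclose(adjoint_matrix(K, C), K, atol=tol)

C = np.array([[2.0, 1.0], [1.0, 3.0]])    # symmetric positive definite weight (hypothetical)
J = np.array([[1.0, 4.0], [4.0, -2.0]])   # any symmetric matrix (hypothetical)
K = np.linalg.solve(C, J)                 # K = C^{-1} J, so that C K = J is symmetric

K_is_symmetric = np.allclose(K, K.T)
self_adjoint_weighted = is_self_adjoint(K, C)
self_adjoint_dot = is_self_adjoint(K, np.eye(2))
```

With these choices, `self_adjoint_weighted` holds while `self_adjoint_dot` fails, which illustrates that self-adjointness is a property of the operator together with the inner product, not of the matrix alone.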

Example 9.16. In Example 9.7, we argued that the second-order derivative operator
$S = D^2$ is self-adjoint with respect to the $L^2$ inner product, when subject to suitable homogeneous
boundary conditions. A direct verification of this result is instructive. According
to the general adjoint equation (9.2), we need to equate

$$\int_a^b S[u]\, \tilde u \, dx = \langle S[u], \tilde u \rangle = \langle u, S^*[\tilde u] \rangle = \int_a^b u \, S^*[\tilde u] \, dx. \tag{9.46}$$


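As before, the computation relies on (in this case two) integrations by parts:

$$\begin{aligned} \langle S[u], \tilde u \rangle = \int_a^b \frac{d^2 u}{dx^2}\, \tilde u \, dx &= \left[ \frac{du}{dx}\, \tilde u \right]_{x=a}^{b} - \int_a^b \frac{du}{dx}\, \frac{d\tilde u}{dx}\, dx \\ &= \left[ \frac{du}{dx}\, \tilde u - u\, \frac{d\tilde u}{dx} \right]_{x=a}^{b} + \int_a^b u\, \frac{d^2 \tilde u}{dx^2}\, dx. \end{aligned}$$

Comparing with (9.46), we conclude that $S^* = D^2 = S$, provided the boundary terms
vanish:

$$\left[ \frac{du}{dx}\, \tilde u - u\, \frac{d\tilde u}{dx} \right]_{x=a}^{b} = \bigl[ u'(b)\, \tilde u(b) - u(b)\, \tilde u'(b) \bigr] - \bigl[ u'(a)\, \tilde u(a) - u(a)\, \tilde u'(a) \bigr] = 0. \tag{9.47}$$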

This requires that we impose suitable boundary conditions at the endpoints, which will
serve to characterize the underlying vector space $U$ on which $S = D^2$ acts. One possibility
is to set $U = \{\, u(a) = u(b) = 0 \,\}$, thereby imposing homogeneous Dirichlet boundary conditions.
Since $\tilde u \in U$ also, $\tilde u(a) = \tilde u(b) = 0$, and hence (9.47) holds, proving self-adjointness.
Alternatively, one can impose homogeneous Neumann, mixed, or periodic boundary conditions
to specify the space $U$ and similarly establish self-adjointness of $S = D^2$.
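
The role of the boundary terms can be checked on concrete functions. The sketch below is an illustration under stated assumptions, not part of the original argument: it works on $[0, 1]$, uses the hypothetical test functions $u = \sin \pi x$, $v = x(1 - x)$, and $w = x$, and approximates the integrals with a hand-written trapezoid rule (the helper `integrate` is our own). Both $u$ and $v$ satisfy the Dirichlet conditions, whereas $w(1) = 1$ violates them.

```python
import numpy as np

def integrate(y, x):
    """Composite trapezoid rule on a uniform grid."""
    h = x[1] - x[0]
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

x = np.linspace(0.0, 1.0, 20001)

u, d2u = np.sin(np.pi * x), -np.pi**2 * np.sin(np.pi * x)   # u(0) = u(1) = 0
v, d2v = x * (1.0 - x), -2.0 * np.ones_like(x)              # v(0) = v(1) = 0
w, d2w = x.copy(), np.zeros_like(x)                         # w(1) = 1 breaks the boundary condition

# Both functions in U: <u'', v> and <u, v''> agree.
lhs_dirichlet = integrate(d2u * v, x)
rhs_dirichlet = integrate(u * d2v, x)

# One function outside U: the two sides differ by the boundary term in (9.47).
lhs_bad = integrate(d2u * w, x)
rhs_bad = integrate(u * d2w, x)
boundary_term = np.pi * np.cos(np.pi)   # [u' w - u w'] evaluated from x = 0 to x = 1
```

In the first case the two integrals coincide up to quadrature error, and in the second their difference reproduces the nonvanishing boundary term.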


Positive Definiteness


Let  us  turn  to  the  characterization  of  positive  definite,  and,  slightly  less  stringently,  positive 
semi-definite  linear  operators.  These  serve  to  extend  the  notions  of  positive  definite  and 
semi-definite  matrices  to  linear  differential  operators  defining  boundary  value  problems. 

Definition 9.17. A linear operator $S\colon U \to U$ on an inner product space is called
positive definite, written $S > 0$, if

$$\langle u, S[u] \rangle > 0 \quad \text{for all} \quad u \ne 0. \tag{9.48}$$

The operator $S$ is positive semi-definite, written $S \ge 0$, if

$$\langle u, S[u] \rangle \ge 0 \quad \text{for all} \quad u. \tag{9.49}$$


Observe that, on the finite-dimensional space $U = \mathbb{R}^n$ equipped with the dot product,
the linear function $S[\mathbf{u}] = K \mathbf{u}$ is positive (semi-)definite if and only if $K$ is a positive
(semi-)definite matrix, as per Definition B.12. (However, changing the inner product on $\mathbb{R}^n$
will result in an alternative notion of positive definiteness for the matrix $K$; see Exercise
9.2.5.) In the infinite-dimensional situations involving differential operators, the domain
of the operator may be only a dense subspace of the full inner product space $U$, and
one imposes the positivity condition (9.48) or (9.49) only on those functions $u$ lying in
the domain of $S$. Fortunately, this technicality has no serious effect on the subsequent
development.

Example 9.18. Consider the operator $S = -D^2$ acting on the space $U$ consisting of
all $C^2$ functions defined on a bounded interval $[a, b]$ and subject to homogeneous Dirichlet
boundary conditions $u(a) = u(b) = 0$. To establish positive definiteness, we evaluate

$$\langle S[u], u \rangle = \int_a^b \left( -\frac{d^2 u}{dx^2} \right) u \, dx = -\left[ \frac{du}{dx}\, u \right]_{x=a}^{b} + \int_a^b \left( \frac{du}{dx} \right)^2 dx = \int_a^b \left( \frac{du}{dx} \right)^2 dx,$$

where we integrated by parts and then used the boundary conditions to eliminate the
boundary terms. The final expression is clearly $\ge 0$, and hence $S$ is at least positive semi-definite.
Moreover, since $u'(x)$ is continuous, the only way the final integral could vanish is if
$u'(x) \equiv 0$, which means $u(x) \equiv c$ is constant. However, the only constant function satisfying
the homogeneous Dirichlet boundary conditions is $u(x) \equiv 0$. Thus, $\langle S[u], u \rangle > 0$ for all
$0 \not\equiv u \in U$, which implies $S > 0$. A similar argument implies positive definiteness when the
functions are subject to the mixed boundary conditions $u(a) = u'(b) = 0$. On the other
hand, any constant function satisfies the Neumann boundary conditions $u'(a) = u'(b) = 0$,
and hence in this case $S \ge 0$ is only positive semi-definite.

Proposition 9.19. If $S > 0$, then $\ker S = \{0\}$. As a consequence, a positive definite
linear system $S[u] = f$ with $f$ in the range of $S$, so $f \in \operatorname{rng} S$, must have a unique solution.

Proof: If $S[u] = 0$, then $\langle u, S[u] \rangle = 0$, which, according to (9.48), is possible only if
$u = 0$. The second statement follows directly from Theorem 1.6. Q.E.D.

Thus,  in  the  finite-dimensional  case,  positive  definiteness  implies  that  the  coefficient 
matrix  of  S[u]  =  Ku  is  nonsingular,  and  hence  existence  of  a  solution  is  automatic.  In 
the  infinite-dimensional  cases  of  boundary  value  problems,  existence  of  solutions  usually 
requires some further analysis, [63].

The most common means of producing self-adjoint, positive (semi-)definite linear operators
is provided by the following general construction. From here on, in order to distinguish
the possibly different norms resulting from the inner products on the domain and
target spaces of a linear operator $L\colon U \to V$, we employ, respectively, the following double
and triple bar notation:

$$\| u \| = \sqrt{\langle u, u \rangle}, \quad u \in U, \qquad |||\, v \,||| = \sqrt{\langle\langle v, v \rangle\rangle}, \quad v \in V. \tag{9.50}$$


Theorem 9.20. Let $L\colon U \to V$ be a linear map between inner product spaces with
adjoint $L^*\colon V \to U$. Then the composite map

$$S = L^* \circ L \colon U \to U$$

is always self-adjoint, $S = S^*$, and positive semi-definite, $S \ge 0$, with $\ker S = \ker L$.
Moreover, $S > 0$ is positive definite if and only if $\ker L = \{0\}$.

Proof: First, by Propositions 9.5 and 9.6,

$$S^* = (L^* \circ L)^* = L^* \circ (L^*)^* = L^* \circ L = S,$$

proving self-adjointness. Furthermore,

$$\langle u, S[u] \rangle = \langle u, L^*[L[u]] \rangle = \langle\langle L[u], L[u] \rangle\rangle = |||\, L[u] \,|||^2 \ge 0 \tag{9.51}$$


for all $u$, proving positive semi-definiteness. Moreover, the result is $> 0$ as long as $L[u] \ne 0$.
Thus, if $\ker L = \{\, u \mid L[u] = 0 \,\} = \{0\}$, then $\langle u, S[u] \rangle > 0$ for all $u \ne 0$, and hence $S$
is positive definite. Finally, the same computation proves that $\ker S = \ker L$. Indeed, if
$L[u] = 0$, then $S[u] = L^*[L[u]] = L^*[0] = 0$. On the other hand, if $S[u] = 0$, then
$0 = \langle u, S[u] \rangle = |||\, L[u] \,|||^2$, and hence $L[u] = 0$. Q.E.D.
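
For matrices under the dot product, where $L^* \circ L$ is represented by $A^T A$, the conclusions of the theorem can be observed directly. The following sketch assumes NumPy; the two matrices and the helper `null_space` (an SVD-based kernel computation written for this illustration) are hypothetical and are not taken from the text.

```python
import numpy as np

def null_space(M, tol=1e-10):
    """Orthonormal basis of ker M, read off from the singular value decomposition."""
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

A_full = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])    # ker A = {0}
A_defic = np.array([[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]])   # ker A spanned by (1, 1, 1)

K_full = A_full.T @ A_full      # represents L* o L for L[u] = A_full u
K_defic = A_defic.T @ A_defic

eig_full = np.linalg.eigvalsh(K_full)     # all strictly positive
eig_defic = np.linalg.eigvalsh(K_defic)   # smallest eigenvalue is zero up to rounding

ker_A = null_space(A_defic)
ker_K = null_space(K_defic)               # spans the same line as ker_A
```

Here `K_full` is positive definite because `A_full` has trivial kernel, while `K_defic` is only positive semi-definite and its kernel coincides with that of `A_defic`.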


We  are  particularly  interested  in  linear  systems  that  are  based  on  the  construction  of 
Theorem  9.20,  namely 

$$S[u] = L^*[L[u]] = f. \tag{9.52}$$

We will refer to the system (9.52) as positive definite or positive semi-definite according
to the status of its defining operator $S$. Thus, the system is positive definite if and only
if $\ker S = \ker L = \{0\}$, i.e., the only solution to the homogeneous system $S[z] = 0$ is the
trivial solution $z = 0$. In this case, the solution to (9.52) (provided it exists) is unique. On
the other hand, if there are nonzero solutions to $S[z] = 0$, then (9.52) is only positive semi-definite,
and does not admit a unique solution. Moreover, unless the Fredholm Alternative
constraints (9.37) hold, there are no solutions. By Theorem 9.20, we can identify

$$\operatorname{coker} S = \ker S^* = \ker S = \ker L, \tag{9.53}$$

which thus implies the following:


Theorem 9.21. Let $S = L^* \circ L$. If the linear system $S[u] = f$ has a solution, then
$\langle z, f \rangle = 0$ for all $z \in \ker L$. Moreover, if $S[u] = f$ and $S[\tilde u] = f$ are two solutions to the
same linear system, then $\tilde u = u + z$, where $z \in \ker L$ is any solution to $L[z] = 0$.

Example 9.22. In the finite-dimensional case, any linear function $L\colon \mathbb{R}^n \to \mathbb{R}^m$ is
represented by matrix multiplication: $L[\mathbf{u}] = A \mathbf{u}$. For the dot product on both the domain
and target spaces, $L^*[\mathbf{v}] = A^T \mathbf{v}$, and so the self-adjoint combination $S = L^* \circ L\colon \mathbb{R}^n \to \mathbb{R}^n$
is represented by the $n \times n$ symmetric matrix $K = A^T A$. According to Theorem 9.20, the
matrix $K$ is always positive semi-definite, and is positive definite if and only if the only
solution to the homogeneous linear system $A \mathbf{z} = \mathbf{0}$ is the trivial solution $\mathbf{z} = \mathbf{0}$. In the
positive semi-definite case, the Fredholm Alternative of Theorem 9.21 states that the linear
system $K \mathbf{u} = \mathbf{f}$ has a solution if and only if $\mathbf{z} \cdot \mathbf{f} = 0$ for all $\mathbf{z} \in \ker A$. (As noted before,
existence of solutions in the finite-dimensional case is not an issue.) Moreover, if $\mathbf{u}$ is any
solution, so is $\tilde{\mathbf{u}} = \mathbf{u} + \mathbf{z}$ for any $\mathbf{z} \in \ker A$.

More generally, if we adopt the weighted inner products (9.4) on the domain and
target spaces represented by the respective positive definite matrices $M > 0$ and $C > 0$,
then the adjoint map $L^*$ has matrix representative $M^{-1} A^T C$, and hence $S = L^* \circ L$ is
given by multiplication by the (not necessarily symmetric) $n \times n$ matrix $K = M^{-1} A^T C A$.
Again, $K \ge 0$ in all cases, and $K > 0$ if and only if $\ker A = \{0\}$. Now, the Fredholm
Alternative states that the linear system $K \mathbf{u} = M^{-1} A^T C A\, \mathbf{u} = \mathbf{f}$ has a solution if and
only if $\langle \mathbf{z}, \mathbf{f} \rangle = \mathbf{z}^T M \mathbf{f} = 0$ for all $\mathbf{z} \in \ker A$. See [89, 112] for applications of this
construction in mechanics, electrical networks, and the stability of structures.
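
A small numerical experiment may help make the weighted Fredholm condition concrete. The sketch below assumes NumPy; the matrices `A`, `M`, `C`, the right-hand sides, and the helpers `weighted_inner` and `solvable` are all hypothetical choices made for illustration. Solvability is tested through the residual of a least-squares solve, which is one convenient option among several.

```python
import numpy as np

A = np.array([[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]])   # ker A spanned by z = (1, 1, 1)
M = np.diag([1.0, 2.0, 3.0])                         # weight matrix on the domain R^3
C = np.diag([4.0, 5.0])                              # weight matrix on the target R^2
K = np.linalg.solve(M, A.T @ C @ A)                  # K = M^{-1} A^T C A

z = np.ones(3)                                       # basis vector of ker A

def weighted_inner(u, v):
    """Weighted inner product <u, v> = u^T M v on the domain."""
    return u @ M @ v

def solvable(f, tol=1e-10):
    """Decide whether K u = f is consistent by checking the least-squares residual."""
    u, *_ = np.linalg.lstsq(K, f, rcond=None)
    return np.linalg.norm(K @ u - f) < tol

f_good = K @ np.array([1.0, -2.0, 0.5])   # lies in the range of K by construction
f_bad = np.array([1.0, 0.0, 0.0])         # <z, f_bad> = 1, so the condition fails
```

The matrix `K` is not symmetric, yet `M @ K` is, reflecting self-adjointness in the weighted inner product. The system is consistent for `f_good`, which is orthogonal to `z` in the weighted sense, and inconsistent for `f_bad`.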


Example 9.23. Consider next the differentiation operator $D[u] = u'$. According
to Example 9.3, if we impose suitable homogeneous boundary conditions on the space of
allowable functions (Dirichlet, Neumann, mixed, or periodic) and use the $L^2$ inner
products on both domain and target space, then $D^*[v] = -v'$. Therefore, the self-adjoint
operator of Theorem 9.20 is given by $S = D^* \circ D = -D^2$.

According to Theorem 9.20, the resulting boundary value problem

$$S[u] = -u'' = f$$


is always positive semi-definite, and is positive definite if and only if $\ker D = \{0\}$, i.e.,
the only function that satisfies $D[u] = u' = 0$ along with the boundary conditions is the
zero function. Consider first the Dirichlet boundary conditions $u(a) = u(b) = 0$. On
a connected interval, $u' \equiv 0$ if and only if $u \equiv c$ is a constant function. However, the
boundary conditions require that $c = 0$, and hence only the zero function appears in the
kernel. We conclude that the Dirichlet boundary value problem is positive definite, and
its solution unique. A similar argument applies to the mixed boundary conditions, e.g.,
$u(a) = u'(b) = 0$, since the condition at $x = a$ is enough to ensure that the constant function
must be zero. On the other hand, any constant function satisfies the Neumann boundary
conditions $u'(a) = u'(b) = 0$, and hence in this case, $\ker D$ consists of all constant functions.
Therefore, the Neumann boundary value problem is only positive semi-definite. And, as
we saw, the solution, when it exists, is not unique, since we can add any constant function
to a solution and obtain another solution. A similar argument proves that the periodic
boundary value problem, with $u(a) = u(b)$, $u'(a) = u'(b)$, is also positive semi-definite,
with the same kinds of existence and uniqueness properties.
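
The same classification shows up in a finite-difference caricature of $D^* \circ D$. The sketch below assumes NumPy and a uniform grid on $[0, 1]$; the helper `forward_difference` and the grid size are hypothetical. Deleting columns of the difference matrix imposes zero boundary values, while keeping both end nodes free is used here as a stand-in for the Neumann problem, which is a modeling assumption of this sketch.

```python
import numpy as np

def forward_difference(N):
    """N x (N+1) matrix sending nodal values on a uniform grid of [0, 1] to difference quotients."""
    h = 1.0 / N
    D = np.zeros((N, N + 1))
    for i in range(N):
        D[i, i], D[i, i + 1] = -1.0 / h, 1.0 / h
    return D

N = 8
D_free = forward_difference(N)     # both end values free (discrete stand-in for Neumann)
D_dirichlet = D_free[:, 1:-1]      # u_0 = u_N = 0
D_mixed = D_free[:, 1:]            # u_0 = 0, right end free

K_free = D_free.T @ D_free
K_dirichlet = D_dirichlet.T @ D_dirichlet
K_mixed = D_mixed.T @ D_mixed

min_eig = {name: np.linalg.eigvalsh(K).min()
           for name, K in [("dirichlet", K_dirichlet), ("mixed", K_mixed), ("free", K_free)]}
constant_in_kernel = np.allclose(K_free @ np.ones(N + 1), 0.0)
```

The Dirichlet and mixed matrices have strictly positive smallest eigenvalue, whereas the free-end matrix annihilates the constant vector, mirroring the role of $\ker D$ in the continuous problem.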

More generally, if we use weighted inner products (9.16) on the domain and target
spaces, then, again subject to suitable boundary conditions, the adjoint is given by (9.19),
and so the self-adjoint boundary value problem $S[u] = D^* \circ D[u] = f$ is based on the more
general differential equation

$$S[u] = -\frac{1}{\rho(x)} \frac{d}{dx} \left( \kappa(x)\, \frac{du}{dx} \right) = f(x). \tag{9.54}$$

Such boundary value problems model the deformations of a nonuniform elastic bar with
density $\rho(x)$ and stiffness $\kappa(x)$, when subject to the external forcing function $f(x)$. Again,
the positive definiteness of the problem depends on whether $\ker D = \{0\}$, and so the exact
same classification holds as in the unweighted case: the Dirichlet and mixed boundary
value problems are positive definite and have a unique solution, whereas the Neumann and
periodic boundary value problems are only positive semi-definite, and the existence of a
solution requires the Fredholm conditions to be satisfied.

Self-adjointness underlies the symmetry of the associated Green’s function. As a function
of $x$, the Green’s function $G_\xi(x) = G(x; \xi)$ satisfies the boundary value problem with
delta function forcing concentrated at position $x = \xi$:

$$S[G_\xi] = \delta_\xi, \quad \text{or, explicitly,} \quad -\frac{1}{\rho(x)} \frac{\partial}{\partial x} \left( \kappa(x)\, \frac{\partial G}{\partial x} \right) = \delta(x - \xi), \tag{9.55}$$


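along with the required homogeneous boundary conditions. Suppose first that we are using
the $L^2$ inner product on the interval $[a, b]$, so that $\rho(x) \equiv 1$. Using the definition of the
delta function $\delta_\xi(x) = \delta(x - \xi)$ and the self-adjointness of $S$, we have, for any $a < x, \xi < b$,

$$\begin{aligned} G(x; \xi) = G_\xi(x) &= \int_a^b G_\xi(y)\, \delta_x(y)\, dy = \langle G_\xi, \delta_x \rangle = \langle G_\xi, S[G_x] \rangle \\ &= \langle S[G_\xi], G_x \rangle = \langle \delta_\xi, G_x \rangle = \int_a^b \delta_\xi(y)\, G_x(y)\, dy = G_x(\xi) = G(\xi; x). \end{aligned} \tag{9.56}$$

This establishes† the symmetry equation

$$G(x; \xi) = G(\xi; x) \tag{9.57}$$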

for the Green’s function of a self-adjoint boundary value problem under the $L^2$ inner product.
This can be regarded as the differential operator version of the fact that the inverse
of a symmetric matrix is also symmetric.
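
The analogy with inverse matrices can be made quite literal. The sketch below assumes NumPy and takes the interval $[0, 1]$ with Dirichlet conditions; the closed-form Green’s function $G(x; \xi) = x(1 - \xi)$ for $x \le \xi$ and $\xi(1 - x)$ for $x \ge \xi$ is the standard formula for $-u'' = f$ and is quoted here as an assumption, since it is not derived in this section. The second-difference matrix `K` is an illustrative discretization.

```python
import numpy as np

def green(x, xi):
    """Green's function of -u'' = f on [0, 1] with u(0) = u(1) = 0 (quoted, not derived here)."""
    return np.where(x <= xi, x * (1.0 - xi), xi * (1.0 - x))

n = 9
h = 1.0 / (n + 1)
nodes = h * np.arange(1, n + 1)                      # interior grid points
X, XI = np.meshgrid(nodes, nodes, indexing="ij")
G = green(X, XI)                                     # G[i, j] = G(x_i; x_j)

# Second-difference approximation of -d^2/dx^2 with zero boundary values.
K = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
K_inverse = np.linalg.inv(K)

f = np.ones(n)
u = h * G @ f                                        # discrete superposition formula
```

For this particular discretization the inverse matrix equals `h * G` at the grid points, so the symmetry of the Green’s function and the symmetry of the inverse of a symmetric matrix are visibly the same statement. With $f \equiv 1$ the superposition reproduces $u(x) = \tfrac12 x(1 - x)$ at the nodes.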

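On the other hand, if we adopt a weighted inner product

$$\langle u, \tilde u \rangle = \int_a^b u(y)\, \tilde u(y)\, \rho(y)\, dy,$$

then the preceding argument must be slightly modified:

$$\begin{aligned} \rho(x)\, G(x; \xi) = \rho(x)\, G_\xi(x) &= \int_a^b \rho(y)\, G_\xi(y)\, \delta_x(y)\, dy = \langle G_\xi, \delta_x \rangle = \langle G_\xi, S[G_x] \rangle \\ &= \langle S[G_\xi], G_x \rangle = \langle \delta_\xi, G_x \rangle = \int_a^b \delta_\xi(y)\, G_x(y)\, \rho(y)\, dy = \rho(\xi)\, G_x(\xi) = \rho(\xi)\, G(\xi; x), \end{aligned}$$

and so the Green’s function associated with a weighted self-adjoint boundary value problem
satisfies a “weighted symmetry condition”

$$\rho(x)\, G(x; \xi) = \rho(\xi)\, G(\xi; x). \tag{9.58}$$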


Remark: Equation (9.58) implies that the modified Green’s function

$$\widehat G(x; \xi) = \frac{G(x; \xi)}{\rho(\xi)} \quad \text{is genuinely symmetric:} \quad \widehat G(x; \xi) = \widehat G(\xi; x). \tag{9.59}$$

The modified Green’s function also has the advantage of recasting the superposition formula
for the solution to the boundary value problem $S[u] = f$ as the appropriate weighted inner
product:

$$u(x) = \int_a^b G(x; \xi)\, f(\xi)\, d\xi = \int_a^b \widehat G(x; \xi)\, f(\xi)\, \rho(\xi)\, d\xi = \langle \widehat G_x, f \rangle, \quad \text{where} \quad \widehat G_x(\xi) = \widehat G(x; \xi).$$

† Symmetry at the endpoints is a consequence of continuity.


Two-Dimensional  Boundary  Value  Problems 


Let us next apply the self-adjoint formalism to study boundary value problems on a
bounded, connected, two-dimensional domain $\Omega \subset \mathbb{R}^2$. We take $L = \nabla$ to be the gradient
operator, mapping a scalar field $u$ to a vector field $\mathbf{v} = \nabla u$. We impose a suitable
set of homogeneous boundary conditions, i.e., Dirichlet, Neumann, or mixed. According to
the calculation in Section 9.1, if we adopt the basic $L^2$ inner products (9.22, 23) between
scalar and vector fields, then the adjoint of the gradient is the negative of the divergence:
$\nabla^* \mathbf{v} = -\nabla \cdot \mathbf{v}$. Therefore, the self-adjoint combination of Theorem 9.20 yields

$$\nabla^* \circ \nabla [u] = -\nabla \cdot (\nabla u) = -\Delta u,$$

where $\Delta$ is the Laplacian operator. In this manner, we are able to write the two-dimensional
Poisson equation in self-adjoint form

$$-\Delta u = -\nabla \cdot (\nabla u) = \nabla^* \circ \nabla u = f, \tag{9.60}$$

as always subject to the selected boundary conditions.

According to Theorem 9.20, $-\Delta = \nabla^* \circ \nabla$ is positive definite if and only if the kernel
of the gradient operator, restricted to the appropriate space of scalar fields, is trivial:
$\ker \nabla = \{0\}$. Since we are assuming that the domain $\Omega$ is connected, Lemma 6.16 tells us
that the only functions that could show up in $\ker \nabla$, and thus prevent positive definiteness,
are the constants. The boundary conditions will tell us whether this occurs. The
only constant function that satisfies either homogeneous Dirichlet or homogeneous mixed
boundary conditions is the zero function, and thus, just as in the one-dimensional case,
the boundary value problem for the Poisson equation subject to Dirichlet or mixed boundary
conditions is positive definite. In particular, this means that its solution is uniquely
defined. On the other hand, any constant function satisfies the homogeneous Neumann
boundary condition $\partial u / \partial \mathbf{n} = 0$, and hence such boundary value problems are only positive
semi-definite. Existence of a solution relies on the Fredholm Alternative, as we discussed
in Example 9.13; moreover, when it exists, the solution is no longer unique, because one
can add in any constant without affecting either the equation or the boundary conditions.
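
A discrete analogue on the unit square shows the same dichotomy. The following is a sketch under stated assumptions: NumPy is available, the grid is uniform, the gradient is approximated by forward differences assembled with Kronecker products, and leaving the boundary values free serves as a stand-in for the Neumann problem. The helper names `forward_difference` and `discrete_gradient` are our own.

```python
import numpy as np

def forward_difference(N):
    """N x (N+1) forward-difference matrix on a uniform grid of [0, 1]."""
    h = 1.0 / N
    return (np.eye(N, N + 1, k=1) - np.eye(N, N + 1)) / h

def discrete_gradient(D):
    """Stack x- and y-differences for a grid function flattened in row-major order."""
    I = np.eye(D.shape[1])
    return np.vstack([np.kron(D, I), np.kron(I, D)])

N = 6
D_free = forward_difference(N)     # keeps the boundary nodes as unknowns
D_dirichlet = D_free[:, 1:-1]      # boundary values fixed at zero

grad_free = discrete_gradient(D_free)
grad_dirichlet = discrete_gradient(D_dirichlet)

# Discrete counterparts of grad* o grad = -Laplacian.
S_free = grad_free.T @ grad_free
S_dirichlet = grad_dirichlet.T @ grad_dirichlet

min_eig_dirichlet = np.linalg.eigvalsh(S_dirichlet).min()
min_eig_free = np.linalg.eigvalsh(S_free).min()
constants_in_kernel = np.allclose(grad_free @ np.ones((N + 1) ** 2), 0.0)
```

With zero boundary values the matrix `S_dirichlet` is positive definite, while `S_free` has the constant grid function in its kernel and is therefore only positive semi-definite.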

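More generally, if we impose weighted inner products (9.32) on our spaces of scalar and
vector fields, then, recalling (9.34), the corresponding self-adjoint boundary value problem
takes the more general form

$$\nabla^* \circ \nabla u = -\frac{1}{\rho(x, y)} \frac{\partial}{\partial x} \left( \kappa_1(x, y)\, \frac{\partial u}{\partial x} \right) - \frac{1}{\rho(x, y)} \frac{\partial}{\partial y} \left( \kappa_2(x, y)\, \frac{\partial u}{\partial y} \right) = f(x, y), \tag{9.61}$$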


along with the chosen boundary conditions on $\partial\Omega$. Again, the Dirichlet and mixed boundary
value problems are positive definite, with unique solutions, while the (suitably weighted)
Neumann problem is only positive semi-definite.

The partial differential equation (9.61) arises in various physical contexts. For example,
consider a steady-state fluid flow moving in a domain $\Omega \subset \mathbb{R}^2$ described by a vector
field $\mathbf{v}$. The flow is called irrotational if it has zero curl, $\nabla \times \mathbf{v} = 0$, and hence, assuming
that $\Omega$ is simply connected, is a gradient $\mathbf{v} = \nabla u$, where $u(x, y)$ is known as the fluid
velocity potential. The constitutive assumptions connect the fluid velocity with its rate of
flow $\mathbf{w} = \kappa\, \mathbf{v}$, where $\kappa(x, y) > 0$ is the scalar density of the fluid. Conservation of mass
provides the final equation, namely $\nabla \cdot \mathbf{w} + f = 0$, where $f(x, y)$ represents fluid sources
($f > 0$) or sinks ($f < 0$). Therefore, the basic equilibrium equations take the form

$$-\nabla \cdot (\kappa\, \nabla u) = f, \quad \text{or} \quad -\frac{\partial}{\partial x} \left( \kappa(x, y)\, \frac{\partial u}{\partial x} \right) - \frac{\partial}{\partial y} \left( \kappa(x, y)\, \frac{\partial u}{\partial y} \right) = f(x, y), \tag{9.62}$$

which is (9.61) with $\rho \equiv 1$ and $\kappa_1 = \kappa_2 = \kappa$. The case of a homogeneous (constant density)
fluid thus reduces to the Poisson equation (4.84), with $f$ replaced by $f / \kappa$.

Symmetry  of  the  Green’s  function  for  the  Poisson  equation  and  the  more  general 
boundary  value  problems  (9.61,  62)  follows  by  an  evident  adaptation  of  the  one-dimensional 
argument  presented  above.  Details  are  left  as  Exercise  9.2.17. 


Exercises 


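9.2.1. Which of the following matrices define self-adjoint linear functions $S\colon \mathbb{R}^2 \to \mathbb{R}^2$ relative
to the dot product? (a) (b) (c) (d)

9.2.2. Answer Exercise 9.2.1 for the inner products
(i) $\langle \mathbf{u}, \tilde{\mathbf{u}} \rangle = 2 u_1 \tilde u_1 + 3 u_2 \tilde u_2$; (ii) $\langle \mathbf{u}, \tilde{\mathbf{u}} \rangle = \mathbf{u}^T C\, \tilde{\mathbf{u}}$, where $C$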


9.2.3. True or false: Given an inner product $\langle \mathbf{u}, \mathbf{v} \rangle$ on $\mathbb{R}^n$:

(a) The inverse of a nonsingular self-adjoint $n \times n$ matrix is self-adjoint.

(b) The inverse of a nonsingular positive definite $n \times n$ matrix is positive definite.

9.2.4. Prove that $K > 0$ is a positive definite $n \times n$ matrix if and only if $J = K^T + K$ is a
symmetric positive definite matrix.

^ 9.2.5. (a) Prove that the $n \times n$ matrix $K$ defines a self-adjoint linear function on $\mathbb{R}^n$ with respect
to the inner product $\langle \mathbf{u}, \tilde{\mathbf{u}} \rangle = \mathbf{u}^T C\, \tilde{\mathbf{u}}$ for $C$ a symmetric positive definite matrix if
and only if the matrix $J = C K$ is symmetric, and hence defines a self-adjoint linear
function with respect to the dot product. (b) Prove that $K > 0$ under the given inner
product if and only if $J > 0$ under the dot product.


9.2.6. Let $D[u] = u'$ be the derivative operator acting on the vector space of $C^2$ scalar functions
$u(x)$ defined for $0 \le x \le 1$ and satisfying the boundary conditions $u(0) = 0$, $u(1) = 0$.
(a) Given the weighted inner product $\langle u, \tilde u \rangle = \int_0^1 u(x)\, \tilde u(x)\, e^x \, dx$ on both its domain and
target spaces, determine the corresponding adjoint operator $D^*$.
(b) Let $S = D^* \circ D$. Write down and solve the boundary value problem $S[u] = 2 e^x$.

9.2.7. Let $c(x) \in C^0[a, b]$ be a continuous function. Prove that the linear multiplication
operator $S[u] = c(x)\, u(x)$ is self-adjoint with respect to the $L^2$ inner product. What sort of
boundary conditions need to be imposed?

9.2.8. True or false: The Neumann boundary value problem $-u'' + u = x$, $u'(0) = u'(\pi) = 0$,
admits a unique solution.

9.2.9. Prove that the complex differential operator $L[u] = i\, \dfrac{du}{dx}$ is self-adjoint with respect to
the $L^2$ Hermitian inner product $\langle u, v \rangle = \int_{-\pi}^{\pi} u(x)\, \overline{v(x)}\, dx$ on the space of continuously
differentiable complex-valued $2\pi$-periodic functions: $u(x + 2\pi) = u(x)$.

9.2.10. Let $L = D^2$. Using the $L^2$ inner products on both the domain and target spaces,
write down a set of homogeneous boundary conditions that makes $L^* = D^2$. Then set
$S = L^* \circ L = D^4$. Do your boundary conditions lead to a boundary value problem
$S[u] = f$ that is (i) positive definite; (ii) positive semi-definite; or (iii) neither?

9.2.11. Let $\beta$ be a real constant. True or false: The second derivative operator $S[u] = u''$ is
self-adjoint with respect to the $L^2$ inner product on the space of functions

$$U = \{\, u(x) \in C^2[0, 1] \mid u(0) = 0, \ u'(1) + \beta\, u(1) = 0 \,\}$$

subject to Dirichlet boundary conditions at the left-hand endpoint and Robin boundary
conditions at the right-hand endpoint.

T 9.2.12. Let $\beta$ be a real constant. Consider the differential operator $S[u] = -u''$ acting on the
space of functions

$$U = \{\, u(x) \in C^2[0, 1] \mid u(0) = 0, \ u'(1) + \beta\, u(1) = 0 \,\}$$

subject to Dirichlet boundary conditions at the left-hand endpoint and Robin boundary
conditions at the right-hand endpoint. Prove that $S > 0$ is positive definite with respect to
the $L^2$ inner product if and only if $\beta > -1$. Hint: Use the analysis following (4.48).

C 9.2.13. The equilibrium equations for a toroidal membrane (an inner tube) lead to the Poisson
equation $-u_{xx} - u_{yy} = f(x, y)$ on a rectangle $0 < x < a$, $0 < y < b$, subject to periodic
boundary conditions

$$u(x, 0) = u(x, b), \quad u_y(x, 0) = u_y(x, b), \quad u(0, y) = u(a, y), \quad u_x(0, y) = u_x(a, y).$$

(a) Prove that the toroidal boundary value problem is self-adjoint. (b) Is it positive definite,
positive semi-definite, or neither? (c) Are there any conditions that must be imposed
on the forcing function $f(x, y)$ in order that a solution exist?

^ 9.2.14. Find the adjoint of the gradient operator $\nabla$ with respect to the $L^2$ inner product
(9.22) between scalar fields, and the following weighted inner product between (column)
vector fields $\mathbf{v} = (v_1(x, y), v_2(x, y))^T$, $\tilde{\mathbf{v}} = (\tilde v_1(x, y), \tilde v_2(x, y))^T$:

$$\langle\langle \mathbf{v}, \tilde{\mathbf{v}} \rangle\rangle = \iint_\Omega \mathbf{v}(x, y)^T\, C(x, y)\, \tilde{\mathbf{v}}(x, y)\, dx\, dy,$$

where the $2 \times 2$ matrix $C(x, y) > 0$ is symmetric, positive definite at
all points $(x, y) \in \Omega$. What sort of boundary conditions do you need to impose? Write out
the corresponding boundary value problem for the equilibrium equation $\nabla^* \circ \nabla u = f$.

9.2.15. Let $\Omega \subset \mathbb{R}^2$ be a bounded domain. Construct a set of homogeneous boundary conditions
on $\partial\Omega$ that make the biharmonic equation $\Delta^2 u = f$: (a) self-adjoint, (b) positive
definite, (c) positive semi-definite, but not positive definite.

9.2.16. Write down the boundary value problem, with right-hand side $\delta_\xi$, satisfied by the modified
Green’s function $\widehat G_\xi(x) = \widehat G(x; \xi)$ given in (9.59). Is the underlying linear operator, which may
depend on $\xi$, self-adjoint with respect to a suitable inner product?


^ 9.2.17. Prove symmetry of the Green’s function, $G(\boldsymbol{\xi}; \mathbf{x}) = G(\mathbf{x}; \boldsymbol{\xi})$, for the Poisson equation
on a bounded domain $\Omega \subset \mathbb{R}^2$ subject to homogeneous Dirichlet boundary conditions.
Hint: Look at how we established (9.56).

9.2.18. Generalize Exercise 9.2.17 to the partial differential equation (9.61).

9.3 Minimization Principles


One of the most important features of positive definite linear problems is that their solution
can be characterized by a quadratic minimization principle. In many physical contexts,
equilibrium configuration(s) serve to minimize the potential energy of the system. Think
of a small ball rolling around in a bowl. After frictional effects have stopped its motion,
the ball will be left sitting in equilibrium at the bottom of the bowl, the position that
minimizes the gravitational potential energy. Minimization principles are employed in
functional analytic proofs of existence of solutions, as well as providing a foundation for
the powerful finite element numerical method to be studied in Chapter 10.

The  basic  theorem  on  quadratic  minimization  principles  is  as  follows. 

Theorem 9.24. Let $S\colon U \to U$ be a self-adjoint and positive definite linear operator
on an inner product space $U$. Suppose that the linear system

$$S[u] = f \tag{9.63}$$

admits a (necessarily unique) solution $u_\star$. Then $u_\star$ minimizes the value of the associated
quadratic functional

$$Q[u] = \tfrac12 \langle u, S[u] \rangle - \langle f, u \rangle, \tag{9.64}$$

meaning that $Q[u_\star] < Q[u]$ for all admissible $u \ne u_\star$ in $U$.

Proof: We are given that $S[u_\star] = f$, and so, for any $u \in U$,

$$Q[u] = \tfrac12 \langle u, S[u] \rangle - \langle u, S[u_\star] \rangle = \tfrac12 \langle u - u_\star, S[u - u_\star] \rangle - \tfrac12 \langle u_\star, S[u_\star] \rangle, \tag{9.65}$$


where we used linearity, along with our assumption that $S$ is self-adjoint, to identify the
terms $\langle u, S[u_\star] \rangle = \langle u_\star, S[u] \rangle$. Since $S > 0$, the first term on the right-hand side of (9.65)
is always $\ge 0$; moreover it equals $0$ if and only if $u = u_\star$. On the other hand, the second
term does not depend on $u$ at all. Thus, to minimize $Q[u]$, we must make the first term
as small as possible, which is accomplished by setting $u = u_\star$. Q.E.D.


Example 9.25. Consider the problem of minimizing a quadratic function

$$Q(u_1, \ldots, u_n) = \frac12 \sum_{i,j=1}^n k_{ij}\, u_i u_j - \sum_{i=1}^n f_i u_i + c, \tag{9.66}$$

depending on $n$ variables $\mathbf{u} = (u_1, u_2, \ldots, u_n)^T \in \mathbb{R}^n$, with fixed real coefficients $k_{ij}$, $f_i$,
and $c$. Since $u_i u_j = u_j u_i$, we can assume, without loss of generality, that the coefficients
of the quadratic terms are symmetric: $k_{ij} = k_{ji}$. We rewrite (9.66) in matrix notation as

$$Q(\mathbf{u}) = \tfrac12\, \mathbf{u} \cdot K \mathbf{u} - \mathbf{f} \cdot \mathbf{u} + c, \tag{9.67}$$

which, apart from the inessential constant term, agrees with (9.64) once we set $S[\mathbf{u}] = K \mathbf{u}$
and use the dot product $\langle \mathbf{u}, \tilde{\mathbf{u}} \rangle = \mathbf{u} \cdot \tilde{\mathbf{u}}$ as the inner product on $\mathbb{R}^n$. Thus, according to
Theorem 9.24, if $K$ is a symmetric positive definite matrix, then the quadratic function
(9.67) has a unique minimizer $\mathbf{u}_\star = (u_1^\star, \ldots, u_n^\star)^T$, which is the solution to the linear system
$K \mathbf{u}_\star = \mathbf{f}$.
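
The minimization property, together with the identity (9.65) behind it, can be checked on a small example. The sketch below assumes NumPy; the matrix `K`, the vector `f`, the constant `c`, and the random trial points are hypothetical values chosen only for illustration.

```python
import numpy as np

K = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, -1.0],
              [0.0, -1.0, 2.0]])        # symmetric positive definite (hypothetical)
f = np.array([1.0, -2.0, 3.0])
c = 5.0

def Q(u):
    """Quadratic function (9.67): Q(u) = 1/2 u.Ku - f.u + c."""
    return 0.5 * u @ K @ u - f @ u + c

u_star = np.linalg.solve(K, f)          # candidate minimizer, K u_star = f

rng = np.random.default_rng(0)
trial_points = u_star + rng.normal(size=(200, 3))

# Excess of Q over its value at u_star, and the prediction 1/2 (u - u_star).K(u - u_star).
gaps = np.array([Q(u) - Q(u_star) for u in trial_points])
predicted = np.array([0.5 * (u - u_star) @ K @ (u - u_star) for u in trial_points])
```

Every entry of `gaps` is positive and agrees with `predicted`, which is exactly the content of (9.65): the excess of $Q$ over its minimum value is the positive definite quadratic form evaluated at $\mathbf{u} - \mathbf{u}_\star$.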


If the positive definite linear operator in Theorem 9.24 comes from the self-adjoint
construction of Theorem 9.20, so $S = L^* \circ L$, then, by (9.51), the quadratic term can be
re-expressed as $\langle u, S[u] \rangle = |||\, L[u] \,|||^2$, using our notational convention (9.50) for the norm
on the target space $V$ of $L$. We can thus rephrase the minimization principle as follows.

Theorem 9.26. Suppose $L\colon U \to V$ is a linear operator between inner product spaces
with adjoint $L^*\colon V \to U$. Assume that $\ker L = \{0\}$, and let $S = L^* \circ L\colon U \to U$ be the
associated positive definite linear operator. If $f \in \operatorname{rng} S$, then the quadratic function

$$Q[u] = \tfrac12\, |||\, L[u] \,|||^2 - \langle f, u \rangle \tag{9.68}$$

has a unique minimizer $u_\star$, which is the solution to the linear system $S[u] = f$.

Warning: In (9.68), the first term $|||\, L[u] \,|||^2$ is computed using the norm based on the
inner product on $V$, while the second term $\langle f, u \rangle$ employs the inner product on $U$.


One  of  the  most  important  applications  of  minimization  is  the  method  of  least  squares, 
which  is  extensively  applied  in  data  analysis  and  approximation  theory.  We  refer  the 
interested  reader  to  [89]  for  developments  in  this  direction.  Here  we  will  concentrate  on 
applications  to  differential  equations. 


Example  9.27.  Consider  the  boundary  value  problem 


—  u"  =  f(x),  u(a)  =  0,  u(b)  —  0.  (9.69) 

The underlying differential operator $S = D^* \circ D = -D^2$, when acting on the space of
functions satisfying the homogeneous Dirichlet boundary conditions, is self-adjoint and, in
fact, positive definite, since $\ker D = \{0\}$. Explicitly, positive definiteness requires

\[ \langle S[u], u \rangle = \int_a^b \bigl[ -u''(x)\, u(x) \bigr]\, dx = \int_a^b \bigl[ u'(x) \bigr]^2\, dx > 0 \tag{9.70} \]

for all nonzero $u(x) \not\equiv 0$ with $u(a) = u(b) = 0$. Notice how we used an integration by parts,
invoking the boundary conditions to eliminate the boundary contributions, to expose the
positivity of the integral. The associated quadratic functional is, using (9.68),

\[ Q[u] = \tfrac12\, ||| u' |||^2 - \langle f, u \rangle = \int_a^b \bigl[ \tfrac12\, u'(x)^2 - f(x)\, u(x) \bigr]\, dx. \]

Its minimum value, taken over all $C^2$ functions that satisfy the homogeneous Dirichlet
boundary conditions, occurs precisely when $u = u_\star$ is the solution to the boundary value
problem.
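As a quick numerical illustration of this minimization principle, the following sketch evaluates $Q[u]$ for the particular forcing $f(x) = 1$ on $[0, 1]$, an example chosen here and not taken from the text. The exact solution is then $u_\star(x) = \tfrac12 x(1 - x)$, with $Q[u_\star] = -\tfrac1{24}$. The helper names `simpson` and `Q`, and the two trial functions, are our own illustrative choices.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def Q(u, du, f, a=0.0, b=1.0):
    """Quadratic functional: integral of (1/2) u'^2 - f u over [a, b]."""
    return simpson(lambda x: 0.5 * du(x) ** 2 - f(x) * u(x), a, b)

f = lambda x: 1.0

# Exact solution of -u'' = 1, u(0) = u(1) = 0.
q_star = Q(lambda x: 0.5 * x * (1 - x), lambda x: 0.5 - x, f)

# Trial 1: the best multiple of sin(pi x); it satisfies the boundary conditions.
c = 4 / math.pi ** 3
q_sine = Q(lambda x: c * math.sin(math.pi * x),
           lambda x: c * math.pi * math.cos(math.pi * x), f)

# Trial 2: x(1 - x), which is twice the true solution.
q_double = Q(lambda x: x * (1 - x), lambda x: 1 - 2 * x, f)

print(q_star, q_sine, q_double)
```

Both trial functions satisfy the boundary conditions, and both give a strictly larger value of $Q$ than the solution does, although the optimal sine multiple comes quite close.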


Sturm-Liouville  Boundary  Value  Problems 


The  most  important  class  of  boundary  value  problems  governed  by  second-order  ordinary 
differential  equations  was  first  systematically  investigated  by  the  nineteenth-century  French 
mathematicians  Jacques  Sturm  and  Joseph  Liouville.  A  Sturm-Liouville  boundary  value 
problem  is  based  on  a  second-order  ordinary  differential  equation  of  the  form 


\[ S[u] = -\frac{d}{dx}\Bigl( p(x)\, \frac{du}{dx} \Bigr) + q(x)\, u = -p(x)\, \frac{d^2 u}{dx^2} - p'(x)\, \frac{du}{dx} + q(x)\, u = f(x) \tag{9.71} \]

on a bounded interval $a < x < b$, supplemented by Dirichlet, Neumann, mixed, or periodic
boundary conditions. To avoid singular points of the differential equation (although we
will later discover that most cases of interest have one or more singular points), we assume
here that $p(x) > 0$ and, to ensure positive definiteness, $q(x) > 0$ for all $a < x < b$.
Sturm-Liouville equations and boundary value problems appear in a remarkably broad
range of applications, and particularly in the analysis of partial differential equations by
the method of separation of variables. Moreover, most of the important special functions,
including Airy functions, Bessel functions, Legendre functions, hypergeometric functions,
and so on, naturally appear as solutions to particular Sturm-Liouville equations, [85, 86].
In the final two chapters, the analysis of basic linear partial differential equations in
curvilinear coordinates, in both two and three dimensions, will require us to solve several particular
examples, including the Bessel, Legendre, and Laguerre equations. For now, though, we
concentrate on understanding how Sturm-Liouville boundary value problems fit into our
self-adjoint and positive definite framework.

Our starting point is the linear operator

\[ L[u] = \begin{pmatrix} u' \\ u \end{pmatrix} \tag{9.72} \]

that maps a scalar function $u(x) \in U$ to a vector-valued function $v(x) = ( v_1(x), v_2(x) )^T \in V$,
whose components are $v_1 = u'$, $v_2 = u$. To compute the adjoint of $L\colon U \to V$, we use
the standard $L^2$ inner product (9.7) on $U$, but adopt the following weighted inner product
on $V$:

\[ \langle\langle v, \tilde v \rangle\rangle = \int_a^b \bigl[ p(x)\, v_1(x)\, \tilde v_1(x) + q(x)\, v_2(x)\, \tilde v_2(x) \bigr]\, dx. \tag{9.73} \]

The positivity assumptions on the weight functions $p, q$ ensure that the latter is a bona
fide inner product. As usual, the adjoint computation relies on integration by parts. Here,
we only need to manipulate the first summand:

\[ \begin{aligned} \langle\langle L[u], v \rangle\rangle &= \int_a^b ( p\, u'\, v_1 + q\, u\, v_2 )\, dx \\ &= p(b)\, u(b)\, v_1(b) - p(a)\, u(a)\, v_1(a) + \int_a^b u\, \bigl[ -(p\, v_1)' + q\, v_2 \bigr]\, dx. \end{aligned} \]

The boundary terms will disappear, provided that, at each endpoint, either $u$ or $v_1$ vanishes.
Since for the linear operator $v = L[u]$ given by (9.72), we can identify $v_1 = u'$, we conclude
that any of our usual boundary conditions (Dirichlet, mixed, or Neumann) remain valid
here. Under any of these conditions,

\[ \langle\langle L[u], v \rangle\rangle = \int_a^b u\, \bigl[ -(p\, v_1)' + q\, v_2 \bigr]\, dx = \langle u, L^*[v] \rangle, \]

and so the adjoint operator is given by

\[ L^*[v] = -\frac{d(p\, v_1)}{dx} + q\, v_2 = -p\, v_1' - p'\, v_1 + q\, v_2. \]

The canonical self-adjoint combination

\[ S[u] = L^* \circ L[u] = L^* \begin{pmatrix} u' \\ u \end{pmatrix} = -\frac{d}{dx}\Bigl( p\, \frac{du}{dx} \Bigr) + q\, u \tag{9.74} \]

then reproduces the Sturm-Liouville differential operator (9.71). Moreover, since $\ker L = \{0\}$
is trivial (why?), the boundary value problem is positive definite for all boundary
conditions, not only Dirichlet and mixed, but also Neumann!
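The adjoint identity $\langle\langle L[u], v \rangle\rangle = \langle u, L^*[v] \rangle$ can be checked numerically. The sketch below does this on $[0, 1]$ for one arbitrary choice of data: $p(x) = 1 + x$, $q(x) = 2 + x^2$, a function $u$ vanishing at both endpoints, and a pair $(v_1, v_2)$ with no boundary conditions imposed. All of these particular functions, and the helper `simpson`, are illustrative assumptions and do not come from the text.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

# Positive weight functions (hypothetical choices).
p = lambda x: 1 + x
dp = lambda x: 1.0
q = lambda x: 2 + x * x

# u satisfies homogeneous Dirichlet conditions, so the boundary terms vanish.
u = lambda x: x * (1 - x)
du = lambda x: 1 - 2 * x

# An arbitrary element v = (v1, v2) of the target space V.
v1 = lambda x: math.cos(x)
dv1 = lambda x: -math.sin(x)
v2 = lambda x: math.exp(x)

# Left side: weighted inner product (9.73) of L[u] = (u', u) with v.
lhs = simpson(lambda x: p(x) * du(x) * v1(x) + q(x) * u(x) * v2(x), 0.0, 1.0)

# Right side: L2 inner product of u with L*[v] = -(p v1)' + q v2.
L_star_v = lambda x: -(dp(x) * v1(x) + p(x) * dv1(x)) + q(x) * v2(x)
rhs = simpson(lambda x: u(x) * L_star_v(x), 0.0, 1.0)

print(lhs, rhs)
```

The two printed values agree up to quadrature and rounding error. Replacing `u` by a function that does not vanish at an endpoint where `v1` is also nonzero breaks the agreement, which is the role the boundary conditions play in the computation.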

A  proof  of  the  following  general  existence  theorem  can  be  found  in  [63]. 
Theorem 9.28. Let $p(x) > 0$ and $q(x) > 0$ for $a < x < b$. Then, for any choice of
boundary conditions (including Neumann), the Sturm-Liouville boundary value problem
(9.71) admits a unique solution.

Theorem 9.26 tells us that the solution to the Sturm-Liouville boundary value problem
(9.71) can be characterized as the unique minimizer of the quadratic functional

\[ Q[u] = \tfrac12\, ||| L[u] |||^2 - \langle f, u \rangle = \int_a^b \bigl[ \tfrac12\, p(x)\, u'(x)^2 + \tfrac12\, q(x)\, u(x)^2 - f(x)\, u(x) \bigr]\, dx \tag{9.75} \]

among all $C^2$ functions satisfying the prescribed homogeneous boundary conditions.


Example 9.29. Let $\omega > 0$. Consider the constant-coefficient Sturm-Liouville problem

\[ -u'' + \omega^2 u = f(x), \qquad u(0) = u(1) = 0, \]

which we studied earlier in Example 6.10. Theorem 9.28 guarantees the existence of a
unique solution. The solution achieves the minimum possible value for the quadratic functional

\[ Q[u] = \int_0^1 \bigl[ \tfrac12\, u'^2 + \tfrac12\, \omega^2 u^2 - f\, u \bigr]\, dx \]

among all $C^2$ functions satisfying the given boundary conditions.
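To see the minimization principle of Example 9.29 at work, one can minimize $Q$ over finite sine sums $u = \sum c_k \sin k\pi x$. By orthogonality the functional decouples into $\sum_k \bigl[ \tfrac14 (k^2\pi^2 + \omega^2)\, c_k^2 - f_k c_k \bigr]$ with $f_k = \int_0^1 f(x) \sin k\pi x\, dx$, so the minimizing coefficients are $c_k = 2 f_k / (k^2\pi^2 + \omega^2)$. The sketch below carries this out for the forcing $f(x) = 1$ (our choice, for which $f_k = 2/(k\pi)$ when $k$ is odd and $0$ otherwise) and compares with the closed-form solution of that particular problem. The function names are hypothetical.

```python
import math

def ritz_solution(omega, x, modes=200):
    """Minimizer of Q over sums of sin(k pi x), k <= modes, for f(x) = 1."""
    total = 0.0
    for k in range(1, modes + 1, 2):      # f_k vanishes for even k
        f_k = 2.0 / (k * math.pi)         # integral of sin(k pi x) over [0, 1]
        c_k = 2.0 * f_k / ((k * math.pi) ** 2 + omega ** 2)
        total += c_k * math.sin(k * math.pi * x)
    return total

def exact_solution(omega, x):
    """Closed-form solution of -u'' + omega^2 u = 1, u(0) = u(1) = 0."""
    return (1.0 - math.cosh(omega * (x - 0.5)) / math.cosh(omega / 2.0)) / omega ** 2

omega = 2.0
for x in (0.25, 0.5, 0.8):
    print(x, ritz_solution(omega, x), exact_solution(omega, x))
```

The coefficients decay like $k^{-3}$, so a modest number of modes already reproduces the exact solution to several digits.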
More generally, suppose we adopt a weighted inner product

\[ \langle u, \tilde u \rangle = \int_a^b u(x)\, \tilde u(x)\, \rho(x)\, dx \tag{9.76} \]

on the domain space $U$, where $\rho(x) > 0$ on $[a, b]$. The same integration by parts computation
proves that, when subject to the homogeneous boundary conditions,

\[ L^*[v] = \frac{1}{\rho} \Bigl[ -\frac{d(p\, v_1)}{dx} + q\, v_2 \Bigr] = -\frac{p}{\rho}\, v_1' - \frac{p'}{\rho}\, v_1 + \frac{q}{\rho}\, v_2, \]

and so the weighted Sturm-Liouville differential operator is

\[ S[u] = L^* \circ L[u] = \frac{1}{\rho} \Bigl[ -\frac{d}{dx}\Bigl( p\, \frac{du}{dx} \Bigr) + q\, u \Bigr]. \tag{9.77} \]

The corresponding weighted Sturm-Liouville equation $S[u] = f$ has the form

\[ S[u] = \frac{1}{\rho(x)} \Bigl[ -\frac{d}{dx}\Bigl( p(x)\, \frac{du}{dx} \Bigr) + q(x)\, u \Bigr] = -\frac{p(x)}{\rho(x)}\, \frac{d^2 u}{dx^2} - \frac{p'(x)}{\rho(x)}\, \frac{du}{dx} + \frac{q(x)}{\rho(x)}\, u = f(x), \tag{9.78} \]

which is, in fact, identical to the ordinary Sturm-Liouville equation (9.71) after we replace
$f$ by $\rho f$. Be that as it may, the weighted generalization will become important when we
study the associated eigenvalue problems.

Example 9.30. Let $m > 0$ be a fixed positive number. Consider the differential
equation

\[ B[u] = -u'' - \frac{1}{x}\, u' + \frac{m^2}{x^2}\, u = f(x), \tag{9.79} \]

where $B$ is known as the Bessel differential operator of order $m$. To place it in weighted
Sturm-Liouville form (9.78), we must find $p(x)$, $q(x)$, and $\rho(x)$ that satisfy

\[ \frac{p(x)}{\rho(x)} = 1, \qquad \frac{p'(x)}{\rho(x)} = \frac{1}{x}, \qquad \frac{q(x)}{\rho(x)} = \frac{m^2}{x^2}. \]

Dividing the second equation by the first, we see that $p'(x)/p(x) = 1/x$, and hence we can
set

\[ p(x) = x, \qquad q(x) = \frac{m^2}{x}, \qquad \rho(x) = x. \]

Thus, when subject to homogeneous Dirichlet, mixed, or even Neumann boundary conditions
on an interval $0 < a < x < b$, the Bessel operator $B$ is positive definite and self-adjoint
with respect to the weighted inner product

\[ \langle u, \tilde u \rangle = \int_a^b u(x)\, \tilde u(x)\, x\, dx. \tag{9.80} \]
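The self-adjointness and positivity of $B$ under the weighted inner product (9.80) can be spot-checked numerically. The sketch below works on the interval $[1, 2]$ with order $m = 2$ and two test functions that vanish at both endpoints. The interval, the order, the test functions, and the helper names are all illustrative choices made here.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

m = 2  # order of the Bessel operator (hypothetical choice)

def bessel_op(u, du, d2u):
    """Return x -> B[u](x) = -u'' - u'/x + (m^2/x^2) u, as in (9.79)."""
    return lambda x: -d2u(x) - du(x) / x + (m * m / (x * x)) * u(x)

def weighted_inner(g, h, a=1.0, b=2.0):
    """Weighted inner product (9.80): integral of g h x over [a, b]."""
    return simpson(lambda x: g(x) * h(x) * x, a, b)

# Two test functions satisfying u(1) = u(2) = 0.
u = lambda x: (x - 1) * (2 - x)
du = lambda x: 3 - 2 * x
d2u = lambda x: -2.0
v = lambda x: math.sin(math.pi * (x - 1))
dv = lambda x: math.pi * math.cos(math.pi * (x - 1))
d2v = lambda x: -math.pi ** 2 * math.sin(math.pi * (x - 1))

Bu = bessel_op(u, du, d2u)
Bv = bessel_op(v, dv, d2v)

lhs = weighted_inner(Bu, v)     # <B[u], v>
rhs = weighted_inner(u, Bv)     # <u, B[v]>
energy = weighted_inner(Bu, u)  # <B[u], u>, expected to be positive

print(lhs, rhs, energy)
```

Dropping the weight factor `x` from `weighted_inner` destroys the symmetry, which is a useful way to convince oneself that the weight $\rho(x) = x$ is essential here.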


Exercises 


9.3.1. Consider the boundary value problem $-u'' = x$, $u(0) = u(1) = 0$. (i) Find the solution.
(ii) Write down a minimization principle that characterizes the solution. (iii) What is
the value of the minimized quadratic functional on the solution? (iv) Write down at least
two other functions that satisfy the boundary conditions and check that they produce larger
values for the energy.


9.3.2.  Answer  Exercise  9.3.1  for  the  boundary  value  problems 


(a) 


d 


1 


du 


X  f\f 


x  ,  u(— 1)  —  u(  1)  =  0;  (b)  —  (e u  ) 


X 


a(0)  =  u  (  1)  =  0; 


dx  \  1  +  x2  dx 

(c)  x2 u"  +  2 xu  =  3x2,  u  (1)  —  u( 2)  =  0;  (d)  xu"  +  3 u  =  1,  u(— 2)  =  u(— 1)  =  0. 

9.3.3. Let $Q[u] = \int_0^1 \bigl[ \tfrac12 (u')^2 - 5u \bigr]\, dx$. (a) Find the function $u_\star(x)$ that minimizes $Q[u]$
among all $C^2$ functions that satisfy $u(0) = u(1) = 0$.

(b) Test your answer by computing $Q[u_\star]$ and then comparing with the value of $Q[u]$ when

O  Q  Q  Q  O  0/1 

u(x)  =  (i)  x  —  x  ,  (ii)  ^x—^x  ,  (Hi)  |sin7rx,  (iv)  x  —  x  . 

9.3.4.  For  each  of  the  following  functionals  and  associated  boundary  conditions:  (i)  write  down 
a  boundary  value  problem  satisfied  by  the  minimizing  function,  and  (ii)  find  the  minimizing 

function  u^(x): 
l  r 


(a)  f  ^(u')2  —  3u  dx ,  a(0)  =  u(l)  =  0. 

J  o 


(o 

(c) 

(d) 

(e) 


•1  r 


r0 


\  (x  +  1)  (u)2  —  5  u  dx ,  u(0)  =  u(  1)  =  0. 


x(u)2  -\-2u  dx ,  u(l)  =  a(3)  =  0. 


ro 


\  ex  (u)2  —  (1  +  ex)  u  dx ,  a(0)  =  u(  1)  =  0, 


1  (xz  +  1)  (u)2  +  xu 


-1 


(x2  +  l)2 


dx,  tx(— 1)  =  u(  1)  =  0. 
9.3.5. Which of the following quadratic functionals possess a unique minimizer among all $C^2$
functions satisfying the indicated boundary conditions? Find the minimizer if it exists.

„2 

^  x  (u)2  +  2  (x  —  1)  u  dx ,  u(l)  =  u(2)  =  0; 


(a) 

(b) 

(c) 

(d) 

(e) 


>7T 


7 r  L 


1  L 


-2 

f1' 

r0 


^  x(u  )  —u  cosx  dx ,  u(— 7r)  =  U ( 7T )  =  0: 

(t/)2  cosx  —  sinx  dx,  u(—l)=u(l)  =  0: 

u(- 2)  =  fz(2)  =  0; 
dx,  i/(0)  =  i/(l)  =  0. 


(1  —  x2)  ( u )2  —  u  dx, 


(x  +  1)  (V)2 


9.3.6.  Let  D[u\  =  v!  be  the  derivative  operator  acting  on  the  vector  space  of  C2  scalar  func¬ 
tions  u(x)  defined  for  0  <  x  <  1  and  satisfying  the  boundary  conditions  'iz(O)  =  0,  u  (1)  =  0. 

r  1 

(a)  Given  the  weighted  inner  product  (u  ,u)  =  /  a(x)  u(x)  ex  dx  on  both  its  domain  and 

J  o 

target  spaces,  determine  the  corresponding  adjoint  operator  79*. 

(b)  Let  S  =  Z9*  of).  Write  down  and  solve  the  boundary  value  problem  S[u]  =  3eG 

(c)  Write  down  a  minimization  principle  that  characterizes  the  solution  you  found  in  part 
(b),  or  explain  why  none  exists. 

9.3.7.  Solve  the  Sturm-Liouville  boundary  value  problem  —Au"  +  9u  =  1,  a(0)  =  0,  u(2)  =  0. 
Is  your  solution  unique? 

9.3.8. Answer Exercise 9.3.7 for the Neumann boundary conditions $u'(0) = 0$, $u'(2) = 0$.

9.3.9.  (i)  Write  the  following  differential  equations  in  Sturm-Liouville  form,  (ii)  If  possible, 
write  down  a  minimization  principle  that  characterizes  the  solutions  to  the  Dirichlet  bound¬ 
ary  value  problem  on  the  interval  [1,  2].  (a)  —  ex  u"  —  ex  u  =  e2x ,  (b)  —  xu' —  u  +2u  =  l, 
(c)  —u"  —  2u'-\~u  =  ex,  (d)  -x2w//  +  2xw/  +  3w  =  1,  (e)  xw//  +  (l-x)w/  +  w  =  0. 

9.3.10. True or false: The Sturm-Liouville operator (9.71) is self-adjoint and positive definite
when subject to periodic boundary conditions $u(a) = u(b)$, $u'(a) = u'(b)$.

9.3.11. Does the quadratic functional $Q[u] = \int_0^1 \bigl[ \tfrac12 (u')^2 - (x - \tfrac12)\, u \bigr]\, dx$ have a minimum
value when $u(x)$ is subject to the homogeneous Neumann boundary value conditions
$u'(0) = u'(1) = 0$? If so, determine the minimum value and find all minimizing functions.

T 9.3.12. (a) Determine the adjoint of the differential operator $L[u] = u' + 2xu$ with respect to
the $L^2$ inner products on $[0, 1]$ when subject to the fixed boundary conditions
$u(0) = u(1) = 0$. (b) Is the self-adjoint operator $S = L^* \circ L$ positive definite? Explain your
answer. (c) Write out the boundary value problem represented by $S[u] = f$. (d) Find the
solution to the boundary value problem when $f(x) = e^{x^2}$. Hint: To integrate the
differential equation, work with the factored form of the differential operator. (e) Discuss
what happens if you instead impose the Neumann boundary conditions $u'(0) = u'(1) = 0$.

9.3.13. Discuss the self-adjointness and positive definiteness of boundary value problems
associated with the Bessel operator (9.79) of order $m = 0$.

9.3.14. Let $u_\star(x)$ be the solution to the self-adjoint positive definite boundary value problem
$S[u] = f$. Prove that if $f(x) \not\equiv 0$, then the minimum of the associated quadratic functional
is strictly negative: $Q[u_\star] < 0$.

9.3.15. Find a function $u(x)$ such that $\int_0^1 u''(x)\, u(x)\, dx > 0$. How do you reconcile this with
the claimed positivity in (9.70)?
9.3.16. Does the inequality (9.70) hold when $u(x) \not\equiv 0$ is subject to the Neumann boundary
conditions $u'(a) = u'(b) = 0$?

9.3.17. True or false: When subject to homogeneous Dirichlet boundary conditions on an
interval $[a, b]$, every nonsingular second-order linear ordinary differential equation
$a(x)\, u'' + b(x)\, u' + c(x)\, u = f(x)$ is (a) self-adjoint, (b) positive definite, (c) positive
semi-definite, with respect to some weighted inner product (9.76).


The  Dirichlet  Principle 


Let  us  now  apply  these  ideas  to  boundary  value  problems  governed  by  the  Poisson  equation 

\[ -\Delta u = \nabla^* \circ \nabla u = f. \tag{9.81} \]

In  the  positive  definite  cases  in  which  the  partial  differential  equation  is  supplemented 
by  either  homogeneous  Dirichlet  or  homogeneous  mixed  boundary  conditions,  our  general 
Minimization  Theorem  9.24  implies  that  the  solution  can  be  characterized  by  the  justly 
famous  Dirichlet  Principle. 

Theorem 9.31. The function $u(x, y)$ that minimizes the Dirichlet integral

\[ \iint_\Omega \bigl( \tfrac12\, u_x^2 + \tfrac12\, u_y^2 - f\, u \bigr)\, dx\, dy \tag{9.82} \]

among all $C^2$ functions that satisfy the prescribed homogeneous Dirichlet or mixed boundary
conditions is the solution to the corresponding boundary value problem for the Poisson
equation $-\Delta u = f$.


The  fact  that  a  minimizer  to  the  Dirichlet  integral  (9.82)  satisfies  the  Poisson  equation 
is  an  immediate  consequence  of  our  general  Minimization  Theorem  9.26.  On  the  other 
hand,  proving  the  existence  of  a  C2  minimizing  function  is  a  nontrivial  issue.  Indeed, 
the  need  for  a  rigorous  existence  proof  was  not  immediately  recognized:  arguing  from  the 
finite-dimensional  situation,  Dirichlet  deemed  existence  to  be  self-evident,  but  it  was  not 
until  50  years  later  that  Hilbert  supplied  the  first  rigorous  proof  —  which  was  one  of  his 
primary  motivations  for  introducing  the  mathematical  machinery  of  Hilbert  space. 
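The Dirichlet principle is easy to probe numerically on the unit square. In the sketch below, the forcing $f = 2\pi^2 \sin \pi x \sin \pi y$ is chosen (by us, for illustration) so that $u = \sin \pi x \sin \pi y$ solves $-\Delta u = f$ with homogeneous Dirichlet conditions. A short calculation gives $Q[u] = -\pi^2/4$. We then perturb $u$ by multiples of $w = x(1 - x)\, y(1 - y)$, which also vanishes on the boundary, and observe that the Dirichlet integral increases. The midpoint quadrature and all names are assumptions of this sketch.

```python
import math

def dirichlet_integral(u, u_x, u_y, f, n=200):
    """Midpoint-rule value of (9.82) on the unit square with an n x n grid."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            total += 0.5 * u_x(x, y) ** 2 + 0.5 * u_y(x, y) ** 2 - f(x, y) * u(x, y)
    return total * h * h

pi = math.pi
f = lambda x, y: 2 * pi * pi * math.sin(pi * x) * math.sin(pi * y)

# Solution of -Laplacian(u) = f with u = 0 on the boundary, and its gradient.
u = lambda x, y: math.sin(pi * x) * math.sin(pi * y)
u_x = lambda x, y: pi * math.cos(pi * x) * math.sin(pi * y)
u_y = lambda x, y: pi * math.sin(pi * x) * math.cos(pi * y)

# Perturbation direction vanishing on the boundary, and its gradient.
w = lambda x, y: x * (1 - x) * y * (1 - y)
w_x = lambda x, y: (1 - 2 * x) * y * (1 - y)
w_y = lambda x, y: x * (1 - x) * (1 - 2 * y)

def perturbed(eps):
    """Dirichlet integral of u + eps * w."""
    return dirichlet_integral(
        lambda x, y: u(x, y) + eps * w(x, y),
        lambda x, y: u_x(x, y) + eps * w_x(x, y),
        lambda x, y: u_y(x, y) + eps * w_y(x, y),
        f,
    )

q0 = perturbed(0.0)
q_plus = perturbed(1.0)
q_minus = perturbed(-1.0)
print(q0, q_plus, q_minus)
```

Because the first-order variation vanishes at the solution, the increase in both directions is the same to leading order, namely $\tfrac12 \varepsilon^2\, ||| \nabla w |||^2$.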

The Dirichlet principle (9.82) was derived under the assumption that the boundary
conditions are homogeneous, either pure Dirichlet or mixed. As it turns out, the
minimization principle, as stated, also applies to the inhomogeneous Dirichlet boundary value
problem. However, the minimizing functional that characterizes the solution to a mixed
boundary value problem with inhomogeneous Neumann conditions on part of the boundary
acquires an additional boundary term.


Theorem 9.32. The solution $u(x, y)$ to the boundary value problem

\[ -\Delta u = f \ \text{ in } \Omega, \qquad u = h \ \text{ on } D \subset \partial\Omega, \qquad \frac{\partial u}{\partial n} = k \ \text{ on } N = \partial\Omega \setminus D, \tag{9.83} \]

with $D \ne \varnothing$, is characterized as the unique function that minimizes the modified Dirichlet
integral

\[ \hat Q[u] = \iint_\Omega \bigl( \tfrac12\, u_x^2 + \tfrac12\, u_y^2 - f\, u \bigr)\, dx\, dy - \int_N k\, u\, ds \tag{9.84} \]

among all $C^2$ functions that satisfy the prescribed boundary conditions.

In particular, the inhomogeneous Dirichlet problem has $N = \varnothing$, in which case the
extra boundary integral does not appear.

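Proof: Write $u(x, y) = \tilde u(x, y) + v(x, y)$, where $v$ is any function that satisfies the given
boundary conditions: $v = h$ on $D$, while $\partial v/\partial n = k$ on $N$. (We specifically do not require
that $v$ satisfy the Poisson equation.) Their difference $\tilde u = u - v$ satisfies the corresponding
homogeneous boundary conditions, along with the modified Poisson equation

\[ -\Delta \tilde u = \tilde f \equiv f + \Delta v \ \text{ in } \Omega, \qquad \tilde u = 0 \ \text{ on } D, \qquad \frac{\partial \tilde u}{\partial n} = 0 \ \text{ on } N. \]

Theorem 9.31 implies that $\tilde u$ minimizes the Dirichlet functional

\[ \tilde Q[\tilde u] = \tfrac12\, ||| \nabla \tilde u |||^2 - \langle \tilde f, \tilde u \rangle = \iint_\Omega \bigl( \tfrac12\, \tilde u_x^2 + \tfrac12\, \tilde u_y^2 - \tilde f\, \tilde u \bigr)\, dx\, dy \]

among all functions satisfying the homogeneous boundary conditions. Writing $Q[u]$ for the
Dirichlet integral (9.82) and $\hat Q[u]$ for the modified Dirichlet integral (9.84), we compute

\[ \begin{aligned} \tilde Q[\tilde u] = \tilde Q[u - v] &= \tfrac12\, ||| \nabla (u - v) |||^2 - \langle f + \Delta v, u - v \rangle \\ &= \tfrac12\, ||| \nabla u |||^2 - \langle\langle \nabla u, \nabla v \rangle\rangle + \tfrac12\, ||| \nabla v |||^2 - \langle f, u \rangle - \langle \Delta v, u \rangle + \langle f + \Delta v, v \rangle \\ &= Q[u] - \iint_\Omega ( \nabla u \cdot \nabla v + u\, \Delta v )\, dx\, dy + C_0, \end{aligned} \]

where

\[ C_0 = \tfrac12\, ||| \nabla v |||^2 + \langle f + \Delta v, v \rangle \]

does not depend on $u$. We then apply formula (6.83) to evaluate the middle terms:

\[ \iint_\Omega ( \nabla u \cdot \nabla v + u\, \Delta v )\, dx\, dy = \oint_{\partial\Omega} u\, \frac{\partial v}{\partial n}\, ds = \int_D h\, \frac{\partial v}{\partial n}\, ds + \int_N u\, k\, ds. \]

Thus,

\[ \tilde Q[\tilde u] = Q[u] - \int_N k\, u\, ds + C_1 = \hat Q[u] + C_1, \]

where the final term $C_1 = C_0 - \int_D h\, \dfrac{\partial v}{\partial n}\, ds$ is fixed by the boundary conditions and the
choice of $v$, and so its value does not change when the function $u$ is varied. We conclude
that $\tilde u$ minimizes $\tilde Q[\tilde u]$ if and only if $u = \tilde u + v$ minimizes $\hat Q[u]$. Q.E.D.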


Exercises 

T 9.3.18. (a) Show that the function $u(x, y) = \tfrac12 ( -xy + xy^2 + x^2 y - x^2 y^2 )$ solves the
homogeneous Dirichlet boundary value problem for the Poisson equation $-\Delta u = x^2 + y^2 - x - y$
on the unit square $S = \{ 0 < x < 1,\ 0 < y < 1 \}$. (b) Write down the Dirichlet integral
(9.82) for this boundary value problem. What is its value for your solution? (c) Write
down three other functions that satisfy the homogeneous Dirichlet boundary conditions on
$S$, and check that all three have larger Dirichlet integrals.

9.3.19. (a) Suppose $u(x, y)$ solves the boundary value problem $-\Delta u = f$ in $\Omega$ and $u = 0$ on
$\partial\Omega$, with $f(x, y) \not\equiv 0$. Prove that its Dirichlet integral (9.82) is strictly negative: $Q[u] < 0$.

(b) Does this result hold for the inhomogeneous boundary value problem $u = h$ on $\partial\Omega$?
C 9.3.20. Consider the boundary value problem $-\Delta u = 1$, $x^2 + y^2 < 1$, $u = 0$, $x^2 + y^2 = 1$.

(a) Find all solutions. (b) Formulate the Dirichlet minimization principle for this problem.
Carefully indicate the function space over which you are minimizing. Make sure your
solution belongs to the function space. (c) Which of the following functions belong to your

function  space?  (z)  l  —  x2—y2,  ( ii )  1  —  \x2  —  ^y2,  (in)  x  —  x^  —  xy2,  (iv)  x4  —  x2y2+y4z1 

(v)  ^e~x  ~y  —  £  e~  .  (d)  For  each  function  in  part  (c)  that  does  belong  to  your  function 
space,  verify  that  its  Dirichlet  integral  is  larger  than  your  solution’s  value. 

9.3.21. Suppose $\lambda > 0$. Under what conditions does the inhomogeneous Neumann problem
$-\Delta u + \lambda u = f$ in $\Omega$, $\partial u/\partial n = k$ on $\partial\Omega$, for the Helmholtz equation have a solution?
Is the solution unique? Hint: Is the boundary value problem positive definite?

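0 9.3.22. Suppose $\kappa(x) > 0$ for all $a < x < b$.

(a) Prove that the solution $u_\star(x)$ to the inhomogeneous Dirichlet boundary value problem

\[ -\frac{d}{dx}\Bigl( \kappa(x)\, \frac{du}{dx} \Bigr) = f(x), \qquad u(a) = \alpha, \quad u(b) = \beta, \]

minimizes the functional $Q[u] = \int_a^b \bigl[ \tfrac12\, \kappa(x)\, u'(x)^2 - f(x)\, u(x) \bigr]\, dx$.
Hint: Mimic the proof of Theorem 9.32.

(b) Construct a minimization principle for the mixed boundary value problem

\[ -\frac{d}{dx}\Bigl( \kappa(x)\, \frac{du}{dx} \Bigr) = f(x), \qquad u(a) = \alpha, \quad u'(b) = \beta. \]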


9.3.23. Use the result of Exercise 9.3.22 to find the $C^2$ function $u_\star(x)$ that minimizes the
integral $Q[u] = \int_1^2 \Bigl[ \dfrac{x}{2} \Bigl( \dfrac{du}{dx} \Bigr)^2 + x^2 u \Bigr]\, dx$ when subject to the boundary conditions
$u(1) = 0$, $u(2) = 1$.

9.3.24. Find the function $u(x)$ that minimizes the integral $Q[u] = \int_1^2 \bigl[ x\, (u')^2 + x^2 u \bigr]\, dx$ subject
to the boundary conditions $u(1) = 1$, $u'(2) = 0$. Hint: Use Exercise 9.3.22(b).

9.3.25. Prove that the functional $Q[u] = \int_0^1 (u')^2\, dx$, when subject to the mixed boundary
conditions $u(0) = 0$, $u'(1) = 1$, has no minimizer.

C 9.3.26. Let $p_1(x, y), p_2(x, y), q(x, y) > 0$ be strictly positive functions on a closed, bounded,
connected domain $\Omega \subset \mathbb{R}^2$. Consider the boundary value problem for the second-order
partial differential equation

\[ -\frac{\partial}{\partial x}\Bigl( p_1(x, y)\, \frac{\partial u}{\partial x} \Bigr) - \frac{\partial}{\partial y}\Bigl( p_2(x, y)\, \frac{\partial u}{\partial y} \Bigr) + q(x, y)\, u = f(x, y), \qquad (x, y) \in \Omega, \tag{9.85} \]

subject to homogeneous Dirichlet boundary conditions $u = 0$ on $\partial\Omega$.

(a) True or false: Equation (9.85) is an elliptic partial differential equation. (b) Write
the boundary value problem in self-adjoint form $L^* \circ L[u] = f$. Hint: Regard (9.85) as a
“two-dimensional Sturm-Liouville equation”. (c) Prove that this boundary value problem
is positive definite, and then find a minimization principle that characterizes the solution.
(d) Find suitable homogeneous Neumann-type boundary conditions involving the values of
the derivatives of $u$ on $\partial\Omega$ that make the resulting boundary value problem for (9.85)
self-adjoint. Is your boundary value problem positive definite? Why or why not?
We have already come to appreciate the value of eigenfunctions for constructing separable
solutions  to  dynamical  partial  differential  equations  such  as  the  one-dimensional  heat  and 
wave  equations.  In  both  cases,  the  eigenfunctions  are  trigonometric,  and  are  used  to 
write  the  solution  to  the  initial  value  problem  in  the  form  of  a  Fourier  series.  The  most 
important  feature  is  that  the  Fourier  eigenfunctions  are  orthogonal  with  respect  to  the 
underlying  L2  inner  product.  As  we  remarked  earlier,  orthogonality  is  not  an  accident. 
Rather,  it  is  a  direct  consequence  of  the  self-adjointness  of  the  linear  differential  operator 
prescribing  the  eigenvalue  equation.  The  goal  of  this  section  is,  in  preparation  for  extending 
the  eigenfunction  method  to  higher-dimensional  and  more  general  dynamical  problems, 
to  establish  the  orthogonality  property  of  eigenfunctions  in  general,  discuss  how  positive 
(semi-)definiteness affects the eigenvalues, and present the basic theory of eigenfunction
series  expansions,  thereby  significantly  generalizing  basic  Fourier  series.  As  an  application, 
we  deduce  a  general  formula  for  the  Green’s  function  of  a  positive  definite  boundary  value 
problem  as  an  infinite  series  in  the  eigenfunctions,  and  use  this  to  formulate  a  condition  that 
guarantees  completeness  of  the  eigenfunctions.  Along  the  way,  we  also  need  to  introduce  an 
important  minimization  principle,  the  Rayleigh  quotient,  that  characterizes  the  eigenvalues 
of  a  positive  definite  linear  system. 

We  begin  with  the  eigenvalue  problem 

\[ S[v] = \lambda\, v \tag{9.86} \]

for a linear operator $S\colon U \to U$ on^ a real or complex vector space $U$. Clearly, $v = 0$ solves
the eigenvalue equation no matter what the scalar $\lambda$ is. If the homogeneous linear system
(9.86) admits a nonzero solution $0 \ne v \in U$, then $\lambda \in \mathbb{C}$ is called an eigenvalue of the
operator $S$ and $v$ a corresponding eigenvector or eigenfunction, depending on the context.
If $\lambda$ is an eigenvalue, then the corresponding eigenspace is the subspace

\[ V_\lambda = \ker (S - \lambda I) = \{\, v \mid S[v] = \lambda v \,\} \subset U, \tag{9.87} \]

consisting of all the eigenvectors/eigenfunctions along with the zero element. To avoid
technical difficulties, we will work under the assumption that all the eigenspaces are
finite-dimensional, and we call $1 \le \dim V_\lambda < \infty$ the geometric multiplicity of the eigenvalue
$\lambda$. Finite-dimensionality is almost always valid, and indeed, will be later established for
regular boundary value problems on bounded domains.


Self-Adjoint  Operators 

In  the  applications  considered  here,  the  vector  space  U  comes  equipped  with  an  inner 
product,  and  S  is  a  self-adjoint  linear  operator.  In  such  instances,  one  can  readily  establish 
the  basic  orthogonality  property  of  the  eigenvectors/eigenfunctions. 

Theorem  9.33.  If  S  =  S*  is  a  self-adjoint  linear  operator  on  an  inner  product  space 
U,  then  all  its  eigenvalues  are  real.  Moreover ,  the  eigenvectors/eigenfunctions  associated 
with  different  eigenvalues  are  automatically  orthogonal. 


^  As  discussed  earlier,  in  the  infinite-dimensional  case,  the  differential  operator  S  might  be 
only  defined  on  a  dense  subspace  of  U  consisting  of  sufficiently  smooth  functions. 
Proof: To prove the first part of the theorem, suppose $\lambda$ is a complex eigenvalue,
so that $S[v] = \lambda v$ for some complex eigenvector/eigenfunction $v \ne 0$. Then, using
the sesquilinearity (B.19) of the underlying Hermitian inner product^ and self-adjointness
(9.45) of $S$, we find

\[ \lambda\, \| v \|^2 = \langle \lambda v, v \rangle = \langle S[v], v \rangle = \langle v, S[v] \rangle = \langle v, \lambda v \rangle = \bar\lambda\, \| v \|^2. \]

Since $v \ne 0$, this immediately implies that $\lambda = \bar\lambda$, its complex conjugate, and hence $\lambda$ must
necessarily be real.

To prove orthogonality, suppose $S[u] = \lambda u$ and $S[v] = \mu v$. Again by self-adjointness,

\[ \lambda\, \langle u, v \rangle = \langle \lambda u, v \rangle = \langle S[u], v \rangle = \langle u, S[v] \rangle = \langle u, \mu v \rangle = \mu\, \langle u, v \rangle, \]

where the final equality relies on the fact that the eigenvalue $\mu$ is real. Therefore, the
assumption that $\lambda \ne \mu$ immediately implies orthogonality: $\langle u, v \rangle = 0$. Q.E.D.

Thus, the eigenvalues of self-adjoint linear operators are necessarily real. If, in addition,
the operator is positive definite, then its eigenvalues must, in fact, be positive.

Theorem 9.34. If $S > 0$ is a self-adjoint positive definite linear operator, then all its
eigenvalues are strictly positive: $\lambda > 0$. If $S \ge 0$ is self-adjoint and positive semi-definite,
then its eigenvalues are nonnegative: $\lambda \ge 0$.

Proof: Self-adjointness assures us that all of the eigenvalues are real. Suppose $S[u] = \lambda u$
with $u \ne 0$ a real eigenfunction. Then

\[ \lambda\, \| u \|^2 = \lambda\, \langle u, u \rangle = \langle \lambda u, u \rangle = \langle S[u], u \rangle > 0, \]

by positive definiteness. Since $\| u \|^2 > 0$, this immediately implies that $\lambda > 0$. The same
argument implies that $\lambda \ge 0$ in the positive semi-definite case. Q.E.D.


All the linear operators to be considered in this text are real, and, at the very least,
self-adjoint, and often either positive definite or semi-definite. Thus, we will restrict our
attention from here on (at least until we reach the Schrödinger equation in the final
subsection) to real operators defined on real vector spaces, knowing a priori that we are not
overlooking any eigenvalues or eigenfunctions by this restriction.


Example 9.35. In finite dimensions, if we equip $U = \mathbb{R}^n$ with the dot product,
then any self-adjoint linear function is given by multiplication by an $n \times n$ symmetric
matrix: $S[u] = K u$, where $K^T = K$. Theorem 9.33 implies the well-known result that
a symmetric matrix has only real eigenvalues. Moreover, the eigenvectors associated with
different eigenvalues are mutually orthogonal.

In fact, it can be proved that, in general, the eigenvectors of a symmetric matrix are
complete, [89]. In other words, there exists an orthogonal basis $v_1, \ldots, v_n$ of $\mathbb{R}^n$ consisting
of eigenvectors of $K$, so $K v_j = \lambda_j v_j$ for $j = 1, \ldots, n$. If the eigenvalues $\lambda_1, \ldots, \lambda_n$ are
all simple, so $\lambda_i \ne \lambda_j$ for $i \ne j$, then the basis eigenvectors are automatically orthogonal.
When $K$ has repeated eigenvalues, this requires selecting an orthogonal basis of each of
the associated eigenspaces $V_\lambda = \ker (K - \lambda I)$, e.g., using the Gram-Schmidt process.


^  We  are  temporarily  working  in  the  vector  space  of  complex- valued  functions.  Once  we 
establish  reality  of  the  eigenvalues  and  eigenfunctions,  we  can  shift  our  focus  back  to  the  real 
function  space. 
Completeness implies that the number of linearly independent eigenvectors associated with
an eigenvalue, i.e., its geometric multiplicity, is the same as the eigenvalue’s algebraic
multiplicity. If, furthermore, the matrix $K > 0$ is symmetric and positive definite, then
Theorem 9.34 implies that all its eigenvalues are positive: $\lambda_j > 0$. In this case, thanks
to completeness, the converse is also valid: a symmetric matrix is positive definite if and
only if it has all positive eigenvalues. These results can all be immediately generalized to
self-adjoint matrices under general inner products on $\mathbb{R}^n$.
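For a concrete instance of Example 9.35, the $2 \times 2$ symmetric case can be worked out in closed form. The sketch below uses the standard formulas for the eigenvalues of $K = \begin{pmatrix} a & b \\ b & d \end{pmatrix}$ with $b \ne 0$, together with the eigenvector $(b, \lambda - a)^T$. The function name and the sample matrix are our own choices.

```python
import math

def symmetric_2x2_eigen(a, b, d):
    """Eigenpairs of K = [[a, b], [b, d]], assuming b != 0.

    The eigenvalues are the roots of the characteristic polynomial,
    and (b, lam - a) is an eigenvector for each root lam.
    """
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return [(lam, (b, lam - a)) for lam in (mean - radius, mean + radius)]

a, b, d = 2.0, 1.0, 3.0   # a symmetric positive definite sample matrix
(lam1, v1), (lam2, v2) = symmetric_2x2_eigen(a, b, d)

# Orthogonality of eigenvectors belonging to different eigenvalues.
dot = v1[0] * v2[0] + v1[1] * v2[1]

# Residual of K v1 - lam1 v1, which should vanish up to rounding.
residual = (a * v1[0] + b * v1[1] - lam1 * v1[0],
            b * v1[0] + d * v1[1] - lam1 * v1[1])

print(lam1, lam2, dot, residual)
```

Both eigenvalues are real and positive, in agreement with Theorems 9.33 and 9.34, and the dot product of the two eigenvectors vanishes up to rounding.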


Example 9.36. Consider the Dirichlet eigenvalue problem

\[ -\frac{d^2 v}{dx^2} = \lambda\, v, \qquad v(0) = 0, \quad v(\ell) = 0, \]

for the differential operator $S = -D^2$ on an interval of length $\ell > 0$. As we know (see,
for instance, Section 4.1), the eigenvalues and eigenfunctions are

\[ \lambda_n = \Bigl( \frac{n\pi}{\ell} \Bigr)^2, \qquad v_n(x) = \sin \frac{n\pi x}{\ell}, \qquad n = 1, 2, 3, \ldots. \]

We now understand this example in our general framework. The fact that the eigenvalues
are real and positive follows from the fact that the boundary value problem is defined by
the self-adjoint positive definite operator

\[ S[u] = D^* \circ D[u] = -D^2[u] = -u'', \]

acting on the vector space $U = \{ u(0) = u(\ell) = 0 \}$, equipped with the $L^2$ inner product:

\[ \langle u, v \rangle = \int_0^\ell u(x)\, v(x)\, dx. \]

The orthogonality of the Fourier sine functions,

\[ \langle v_m, v_n \rangle = \int_0^\ell \sin \frac{m\pi x}{\ell}\, \sin \frac{n\pi x}{\ell}\, dx = 0 \qquad \text{for} \quad m \ne n, \]

is also an automatic consequence of their status as eigenfunctions of this self-adjoint
boundary value problem.
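Both claims of Example 9.36 can be confirmed numerically. The sketch below, for the illustrative length $\ell = 2$, builds the Gram matrix of the first four sine eigenfunctions and recovers each eigenvalue from the quotient $\langle v_n', v_n' \rangle / \langle v_n, v_n \rangle$, which equals $\langle S[v_n], v_n \rangle / \| v_n \|^2$ after an integration by parts. The helper names are hypothetical.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

ell = 2.0  # interval length (an arbitrary choice for this sketch)

def v(n):
    """n-th Dirichlet eigenfunction sin(n pi x / ell)."""
    return lambda x: math.sin(n * math.pi * x / ell)

def dv(n):
    """Derivative of the n-th eigenfunction."""
    return lambda x: (n * math.pi / ell) * math.cos(n * math.pi * x / ell)

def inner(g, h):
    """L2 inner product on [0, ell]."""
    return simpson(lambda x: g(x) * h(x), 0.0, ell)

# Gram matrix: zero off the diagonal, ell / 2 on the diagonal.
gram = [[inner(v(m), v(n)) for n in range(1, 5)] for m in range(1, 5)]

# Each quotient should reproduce the eigenvalue (n pi / ell)^2.
quotients = [inner(dv(n), dv(n)) / inner(v(n), v(n)) for n in range(1, 5)]

for row in gram:
    print([round(entry, 8) for entry in row])
print(quotients)
```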


Example 9.37. Similarly, the periodic boundary value problem

\[ -v'' = \lambda\, v, \qquad v(-\pi) = v(\pi), \quad v'(-\pi) = v'(\pi), \tag{9.88} \]

has eigenvalues $\lambda_0 = 0$, with eigenfunction $v_0(x) = 1$, and $\lambda_n = n^2$, for $n = 1, 2, 3, \ldots$, each
possessing two independent eigenfunctions: $v_n(x) = \cos nx$ and $\tilde v_n(x) = \sin nx$. In this
case, a zero eigenvalue appears because $S = D^* \circ D = -D^2$ is only positive semi-definite
on the space of periodic functions. Theorem 9.33 implies the all-important orthogonality of
the Fourier eigenfunctions corresponding to different eigenvalues: $\langle v_m, v_n \rangle = \langle v_m, \tilde v_n \rangle =
\langle \tilde v_m, \tilde v_n \rangle = 0$ for $m \ne n$, under the $L^2$ inner product on $[-\pi, \pi]$. However, since they have
the same eigenvalue, the orthogonality of $v_n(x) = \cos nx$ and $\tilde v_n(x) = \sin nx$, while true,
is not ensured and must be checked by hand.
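The check by hand is immediate, since $\cos nx \sin nx = \tfrac12 \sin 2nx$ is an odd function and so integrates to zero over $[-\pi, \pi]$. For readers who prefer a numerical confirmation, here is a minimal sketch; the Simpson helper and the range of $n$ are our own choices.

```python
import math

def simpson(g, a, b, n=1000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def mixed_inner(n):
    """L2 inner product of cos(n x) and sin(n x) on [-pi, pi]."""
    return simpson(lambda x: math.cos(n * x) * math.sin(n * x), -math.pi, math.pi)

def cos_norm_squared(n):
    """Squared L2 norm of cos(n x) on [-pi, pi]; equals pi for n >= 1."""
    return simpson(lambda x: math.cos(n * x) ** 2, -math.pi, math.pi)

mixed = [mixed_inner(n) for n in range(1, 6)]
norms = [cos_norm_squared(n) for n in range(1, 6)]
print(mixed)
print(norms)
```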


Example  9.38.  On  the  other  hand,  the  self-adjoint  boundary  value  problem 


$$-u'' = \lambda u, \qquad \lim_{x \to \infty} u(x) = 0, \qquad \lim_{x \to -\infty} u(x) = 0, \tag{9.89}$$

on the real line has no eigenvalues: no matter what the value of $\lambda$, the only solution decaying to 0 at both $\pm\infty$ is the zero solution. Indeed, exponential solutions that decay at one end become infinitely large at the other. The trigonometric functions $u(x) = \cos \omega x$ and $\sin \omega x$ satisfy the differential equation when $\lambda = \omega^2 > 0$, but do not go to zero as $|x| \to \infty$, and so do not qualify as bona fide eigenfunctions. Rather, because they are bounded on the entire line, they represent the "continuous spectrum" of the underlying self-adjoint differential operator, [95]. In this particular context, the continuous spectrum leads directly to the Fourier transform.


Example  9.39.  The  eigenvalue  problem  for  the  Bessel  differential  operator  of  order 
m,  given  in  (9.79),  is  governed  by  the  following  differential  equation: 


$$-u'' - \frac{1}{x}\, u' + \frac{m^2}{x^2}\, u = \lambda u, \tag{9.90}$$

or, equivalently,

$$x^2 \frac{d^2 u}{dx^2} + x \frac{du}{dx} + (\lambda x^2 - m^2)\, u = 0,$$

supplemented by appropriate homogeneous boundary conditions at the endpoints of the interval $0 < a < b$. Its eigenfunctions are not elementary, but, as we will learn in Chapter 11, can be expressed in terms of Bessel functions. Nevertheless, no matter what their eventual formula, Theorem 9.33 guarantees the orthogonality of any two eigenfunctions $v, \tilde v$ associated with distinct eigenvalues $\lambda \ne \tilde\lambda$ under the weighted inner product (9.80):

$$\langle v, \tilde v \rangle = \int_a^b v(x)\, \tilde v(x)\, x\, dx = 0.$$

Example 9.40. According to equation (9.60), on a bounded domain $\Omega \subset \mathbb{R}^2$, the (negative) Laplacian $-\Delta$ forms a self-adjoint positive (semi-)definite operator under the $L^2$ inner product (9.22) when subject to one of the usual sets of homogeneous boundary conditions. Let us, for specificity, concentrate on the Dirichlet case. The eigenfunctions of the Laplacian are the nonzero solutions to the following boundary value problem:

$$-\Delta v = \lambda v \quad \text{in} \quad \Omega, \qquad v = 0 \quad \text{on} \quad \partial\Omega. \tag{9.91}$$

The underlying partial differential equation, namely

$$\Delta v + \lambda v = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} + \lambda v = 0,$$

is known as the Helmholtz equation, named after the influential and wide-ranging German applied mathematician Hermann von Helmholtz. As we will see, the Helmholtz equation plays a central role in the solution of the two-dimensional heat, wave, and Schrödinger equations.

Only in a few special cases, e.g., rectangles and circular disks, can the eigenfunctions and eigenvalues be determined exactly; see Chapter 11 for details. Nevertheless, Theorem 9.34 guarantees that, for all domains, the eigenvalues are always nonnegative, $\lambda \ge 0$, with $\lambda_0 = 0$ being an eigenvalue only in positive semi-definite cases, e.g., Neumann boundary conditions. Moreover, Theorem 9.33 ensures the orthogonality of any two eigenfunctions,

$$\langle v, \tilde v \rangle = \iint_\Omega v(x, y)\, \tilde v(x, y)\, dx\, dy = 0,$$

that are associated with distinct eigenvalues $\lambda \ne \tilde\lambda$.
The Rayleigh Quotient


We  have  already  learned  how  to  characterize  the  solutions  of  positive  definite  boundary 
value  problems  by  a  minimization  principle.  One  can  also  characterize  their  eigenvalues 
by  a  minimization  principle,  named  after  the  prolific  nineteenth-century  English  applied 
mathematician  Lord  Rayleigh  (John  Strutt). 

Definition 9.41. Let $S\colon U \to U$ be a self-adjoint linear operator on an inner product space. The Rayleigh quotient of $S$ is defined as

$$R[u] = \frac{\langle u, S[u] \rangle}{\| u \|^2} \qquad \text{for} \qquad 0 \ne u \in U. \tag{9.92}$$


We are, in fact, primarily interested in the Rayleigh quotient of positive definite operators, for which $R[u] > 0$ for all $u \ne 0$. If $S = L^* \circ L$, then, using (9.51), we can rewrite the Rayleigh quotient in the alternative form

$$R[u] = \frac{|||\, L[u] \,|||^2}{\| u \|^2}, \tag{9.93}$$

keeping in mind our notational convention (9.50) for the respective norms on $U$ and $V$.

Theorem 9.42. Let $S$ be a self-adjoint linear operator. Then the minimum value of its Rayleigh quotient,

$$\lambda_\star = \min \{\, R[u] \mid u \ne 0 \,\}, \tag{9.94}$$

is the smallest eigenvalue of the operator $S$. Moreover, any $0 \ne v_\star \in U$ that achieves this minimum value, $R[v_\star] = \lambda_\star$, is an associated eigenvector/eigenfunction: $S[v_\star] = \lambda_\star v_\star$.

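Proof: Suppose that $v_\star \in U$ is a minimizing element, and

$$\lambda_\star = R[v_\star] = \frac{\langle v_\star, S[v_\star] \rangle}{\| v_\star \|^2} \tag{9.95}$$

the minimum value. Given any $u \in U$, define the scalar function†

$$g(t) = R[v_\star + t u] = \frac{\langle v_\star + t u, S[v_\star + t u] \rangle}{\| v_\star + t u \|^2} = \frac{\langle v_\star, S[v_\star] \rangle + 2t\, \langle u, S[v_\star] \rangle + t^2 \langle u, S[u] \rangle}{\| v_\star \|^2 + 2t\, \langle u, v_\star \rangle + t^2 \| u \|^2},$$

where we used the self-adjointness of $S$ and the fact that we are working in a real inner product space to identify the terms

$$\langle u, S[v_\star] \rangle = \langle S[u], v_\star \rangle = \langle v_\star, S[u] \rangle.$$

Since

$$g(0) = R[v_\star] \le R[v_\star + t u] = g(t),$$

the function $g(t)$ will attain its minimum value at $t = 0$. Elementary calculus tells us that

$$0 = g'(0) = 2\, \frac{\langle u, S[v_\star] \rangle\, \| v_\star \|^2 - \langle v_\star, S[v_\star] \rangle\, \langle u, v_\star \rangle}{\| v_\star \|^4}.$$

† $g(t)$ is not defined if $v_\star + t u = 0$, but this does not affect the argument.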
Therefore, using (9.95) to replace $\langle v_\star, S[v_\star] \rangle$ by $\lambda_\star \| v_\star \|^2$, we must have

$$\langle u, S[v_\star] \rangle - \lambda_\star \langle u, v_\star \rangle = \langle u, S[v_\star] - \lambda_\star v_\star \rangle = 0. \tag{9.96}$$

The only way the inner product in (9.96) can vanish for all possible $u \in U$ is if

$$S[v_\star] = \lambda_\star v_\star, \tag{9.97}$$

which means that $0 \ne v_\star$ is an eigenfunction and $\lambda_\star$ its associated eigenvalue.

On the other hand, if $v$ is any eigenfunction, so $S[v] = \lambda v$, where, by self-adjointness, the eigenvalue $\lambda$ is necessarily real, then the value of its Rayleigh quotient is

$$R[v] = \frac{\langle v, S[v] \rangle}{\| v \|^2} = \frac{\langle v, \lambda v \rangle}{\| v \|^2} = \lambda. \tag{9.98}$$

Since $\lambda_\star$ was, by definition, the smallest possible value of the Rayleigh quotient, it thus must necessarily be the smallest eigenvalue. Q.E.D.
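A finite-dimensional illustration of this minimization principle: for a symmetric matrix $K$ acting on $\mathbb{R}^2$ with the dot product, the minimum of the Rayleigh quotient over unit vectors should coincide with the smallest eigenvalue, and the minimizer should satisfy the eigenvalue equation. The matrix `K`, the sampling resolution, and the helper name `rayleigh` in this sketch are illustrative assumptions.

```python
import math

def rayleigh(K, u):
    """Rayleigh quotient R[u] = <u, K u> / ||u||^2 for a 2x2 symmetric matrix K."""
    Ku = (K[0][0] * u[0] + K[0][1] * u[1], K[1][0] * u[0] + K[1][1] * u[1])
    return (u[0] * Ku[0] + u[1] * Ku[1]) / (u[0] ** 2 + u[1] ** 2)

K = [[2.0, -1.0], [-1.0, 5.0]]  # sample symmetric matrix (assumption)

# Exact smallest eigenvalue from the characteristic polynomial of a 2x2 matrix.
trace = K[0][0] + K[1][1]
det = K[0][0] * K[1][1] - K[0][1] * K[1][0]
lam_min = (trace - math.sqrt(trace ** 2 - 4 * det)) / 2

# R[c u] = R[u] for c != 0, so it suffices to scan unit vectors (cos t, sin t), 0 <= t < pi.
samples = 100000
best_t = min((math.pi * i / samples for i in range(samples)),
             key=lambda t: rayleigh(K, (math.cos(t), math.sin(t))))
v_star = (math.cos(best_t), math.sin(best_t))
r_min = rayleigh(K, v_star)

# Residual of the eigenvalue equation K v = lambda v at the approximate minimizer.
residual = (K[0][0] * v_star[0] + K[0][1] * v_star[1] - r_min * v_star[0],
            K[1][0] * v_star[0] + K[1][1] * v_star[1] - r_min * v_star[1])

print("smallest eigenvalue:", lam_min)
print("minimum of sampled Rayleigh quotient:", r_min)
print("residual K v - R[v] v:", residual)
```

The sampled minimum can never fall below the smallest eigenvalue, and the residual shrinks as the sampling is refined.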


Remark: The existence of a minimizing function is not addressed in this result, and, indeed, there may be no minimum eigenvalue; the infimum of the set of eigenvalues could be $-\infty$ or, even if finite, not an eigenvalue. However, for the positive definite boundary value problems considered here, the eigenvalues are all strictly positive, and one can, with some additional analysis, [44], prove the existence of a minimizing eigenfunction, and hence a smallest positive eigenvalue.


We label the eigenvalues in increasing order, so that, assuming positive definiteness, $0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots$, with $\lambda_1$ the minimum eigenvalue and hence the minimum value of the Rayleigh quotient. To characterize the other eigenvalues, we need to restrict the class of functions over which one minimizes. Indeed, since the $n$th eigenfunction $v_n$ must be orthogonal to all its predecessors $v_1, \ldots, v_{n-1}$, it makes sense to try minimizing the Rayleigh quotient over such elements.


Theorem 9.43. Let $v_1, \ldots, v_{n-1}$ be eigenfunctions corresponding to the first $n - 1$ eigenvalues $0 < \lambda_1 \le \cdots \le \lambda_{n-1}$ of the positive definite self-adjoint linear operator $S$. Let

$$U_{n-1} = \{\, u \mid \langle u, v_1 \rangle = \cdots = \langle u, v_{n-1} \rangle = 0 \,\} \subset U \tag{9.99}$$

be the set of functions that are orthogonal to the indicated eigenfunctions. Then the minimum value of the Rayleigh quotient function restricted to the subspace $U_{n-1}$ is the $n$th eigenvalue of $S$, that is,

$$\lambda_n = \min \{\, R[u] \mid 0 \ne u \in U_{n-1} \,\}, \tag{9.100}$$

and any minimizer is an associated eigenfunction $v_n$.


Proof: We follow the preceding proof, but now restrict $v_\star$ and $u$ to belong to the subspace $U_{n-1}$. Observe that $S[u] \in U_{n-1}$ whenever $u \in U_{n-1}$, because, by self-adjointness,

$$\langle S[u], v_j \rangle = \langle u, S[v_j] \rangle = \lambda_j \langle u, v_j \rangle = 0 \qquad \text{for} \qquad j = 1, \ldots, n - 1.$$

Thus, if $0 \ne v_\star \in U_{n-1}$ minimizes the Rayleigh quotient, then (9.96) holds for arbitrary $u \in U_{n-1}$. In particular, choosing $u = S[v_\star] - \lambda_\star v_\star$, we conclude that $v_\star$ satisfies the eigenvalue equation (9.97), and hence must be an eigenfunction that is orthogonal to the first $n - 1$ eigenfunctions. This means that $\lambda_\star = \lambda_n$ must be the next-lowest eigenvalue and $v_\star = v_n$ one of its associated eigenfunctions. Q.E.D.
Example 9.44. Return to the Dirichlet eigenvalue problem on the interval $[0, \ell]$ for the self-adjoint (under the $L^2$ inner product) differential operator $-D^2 = D^* \circ D$ discussed in Example 9.36. Its Rayleigh quotient can be written as

$$\begin{aligned}
R[u] &= \frac{\langle u, -u'' \rangle}{\| u \|^2} = \frac{-\displaystyle\int_0^\ell u(x)\, u''(x)\, dx}{\displaystyle\int_0^\ell u(x)^2\, dx} \\
&= \frac{\| u' \|^2}{\| u \|^2} = \frac{\displaystyle\int_0^\ell u'(x)^2\, dx}{\displaystyle\int_0^\ell u(x)^2\, dx},
\end{aligned}$$

where the second expression, based on the alternative form (9.93), can be readily deduced from the first via an integration by parts. (Here, both domain and target space of $L = D$ use the same $L^2$ norm.) According to Theorem 9.42, the minimum value of $R[u]$ over all nonzero functions $u(x) \not\equiv 0$ satisfying the boundary conditions $u(0) = u(\ell) = 0$ is the lowest eigenvalue, namely

$$\lambda_1 = \frac{\pi^2}{\ell^2} = \min \{\, R[u] \mid u(0) = u(\ell) = 0, \ u(x) \not\equiv 0 \,\},$$

which is achieved if and only if $u(x)$ is a nonzero constant multiple of $\sin(\pi x/\ell)$, the corresponding eigenfunction. The reader is invited to numerically test this result by fixing a value of $\ell$, and then evaluating $R[u]$ on various functions $u(x)$ satisfying the boundary conditions to check that the numerical value is always larger than $\pi^2/\ell^2$, the smallest eigenvalue. The second eigenvalue can be found by minimizing over all nonzero functions that are orthogonal to the first eigenfunction:

$$\lambda_2 = \frac{4\pi^2}{\ell^2} = \min \left\{\, R[u] \ \middle|\ u(0) = u(\ell) = 0, \ \int_0^\ell u(x) \sin\frac{\pi x}{\ell}\, dx = 0, \ u(x) \not\equiv 0 \,\right\},$$

and similarly for the higher eigenvalues.
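The numerical test suggested in this example can be sketched as follows. The choice $\ell = 1$, the polynomial trial functions, and the helper names `integrate` and `rayleigh` are illustrative assumptions. The function $x(\ell - x)$ should give $R[u] = 10/\ell^2$, slightly above $\pi^2/\ell^2 \approx 9.87/\ell^2$. The function $x(\ell - x)(\ell/2 - x)$ is odd about the midpoint, hence orthogonal to $\sin(\pi x/\ell)$, and should give $42/\ell^2$, above $4\pi^2/\ell^2 \approx 39.48/\ell^2$.

```python
import math

def integrate(f, a, b, n=4000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def rayleigh(u, du, ell):
    """R[u] = (integral of u'^2) / (integral of u^2) over [0, ell]."""
    return (integrate(lambda x: du(x) ** 2, 0.0, ell)
            / integrate(lambda x: u(x) ** 2, 0.0, ell))

ell = 1.0  # illustrative interval length (assumption)
lam1 = math.pi ** 2 / ell ** 2
lam2 = 4 * math.pi ** 2 / ell ** 2

# Trial functions (u, u') that vanish at x = 0 and x = ell.
trials = {
    "sin(pi x/ell)": (lambda x: math.sin(math.pi * x / ell),
                      lambda x: math.pi / ell * math.cos(math.pi * x / ell)),
    "x (ell - x)": (lambda x: x * (ell - x), lambda x: ell - 2 * x),
    "x^2 (ell - x)": (lambda x: x * x * (ell - x), lambda x: 2 * ell * x - 3 * x * x),
}
quotients = {name: rayleigh(u, du, ell) for name, (u, du) in trials.items()}

# Trial function odd about the midpoint, so orthogonal to the first eigenfunction.
w = lambda x: x * (ell - x) * (ell / 2 - x)
dw = lambda x: ell ** 2 / 2 - 3 * ell * x + 3 * x * x
overlap = integrate(lambda x: w(x) * math.sin(math.pi * x / ell), 0.0, ell)
r_w = rayleigh(w, dw, ell)

for name, value in quotients.items():
    print(name, value, ">= lambda_1 =", lam1)
print("overlap with first eigenfunction:", overlap)
print("R[w] =", r_w, ">= lambda_2 =", lam2)
```

Only the exact eigenfunction attains the bound; every other admissible trial function gives a strictly larger quotient.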

Example 9.45. Consider the Helmholtz eigenvalue problem (9.91) on a bounded domain $\Omega \subset \mathbb{R}^2$, subject to Dirichlet boundary conditions. The associated Rayleigh quotient (9.93) can be written in the form

$$R[u] = \frac{|||\, \nabla u \,|||^2}{\| u \|^2} = \frac{\displaystyle\iint_\Omega \left[ \left( \frac{\partial u}{\partial x} \right)^2 + \left( \frac{\partial u}{\partial y} \right)^2 \right] dx\, dy}{\displaystyle\iint_\Omega u(x, y)^2\, dx\, dy}. \tag{9.101}$$

Its minimum value among all nonzero functions $u(x, y) \not\equiv 0$ subject to the boundary conditions $u = 0$ on $\partial\Omega$ is the smallest eigenvalue $\lambda_1$, and the minimizing function is any nonzero constant multiple of the associated eigenfunction $v_1(x, y)$. To obtain a higher eigenvalue $\lambda_n$, one minimizes $R[u]$, where $u(x, y) \not\equiv 0$ again satisfies the boundary conditions and, in addition, is orthogonal to the preceding $n - 1$ eigenfunctions:

$$0 = \langle u, v_k \rangle = \iint_\Omega u(x, y)\, v_k(x, y)\, dx\, dy, \qquad \text{for} \qquad k = 1, \ldots, n - 1.$$

It can be proved, [34, 44], that, as long as the domain is bounded with, as always, a reasonably nice boundary, there is a solution to each of these minimization problems, and hence the Helmholtz equation admits an infinite sequence of positive eigenvalues $0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots$, with $\lambda_n \to \infty$ becoming arbitrarily large as $n \to \infty$; see also Theorem 9.47 below.
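To make the two-dimensional Rayleigh quotient (9.101) concrete, the sketch below evaluates it on the unit square, an assumed choice of domain for which the smallest Dirichlet eigenvalue is $2\pi^2$, with eigenfunction $\sin \pi x \sin \pi y$. The polynomial trial function $x(1-x)\,y(1-y)$ and the helper names are likewise illustrative assumptions; its quotient works out to 20, just above $2\pi^2 \approx 19.74$.

```python
import math

def double_integral(f, n=200):
    """Midpoint rule on the unit square [0, 1] x [0, 1]."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return h * h * sum(f(x, y) for x in pts for y in pts)

def rayleigh(u, ux, uy):
    """R[u] = (integral of |grad u|^2) / (integral of u^2), cf. (9.101)."""
    numerator = double_integral(lambda x, y: ux(x, y) ** 2 + uy(x, y) ** 2)
    denominator = double_integral(lambda x, y: u(x, y) ** 2)
    return numerator / denominator

# Polynomial trial function vanishing on the boundary of the square.
u = lambda x, y: x * (1 - x) * y * (1 - y)
ux = lambda x, y: (1 - 2 * x) * y * (1 - y)
uy = lambda x, y: x * (1 - x) * (1 - 2 * y)
r_trial = rayleigh(u, ux, uy)

# Exact first eigenfunction of the unit square.
v = lambda x, y: math.sin(math.pi * x) * math.sin(math.pi * y)
vx = lambda x, y: math.pi * math.cos(math.pi * x) * math.sin(math.pi * y)
vy = lambda x, y: math.pi * math.sin(math.pi * x) * math.cos(math.pi * y)
r_eig = rayleigh(v, vx, vy)

lam1 = 2 * math.pi ** 2
print("R[trial] =", r_trial, " R[eigenfunction] =", r_eig, " lambda_1 =", lam1)
```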
Eigenfunction Series


For  our  applications  to  dynamical  partial  differential  equations,  we  will  be  particularly 
interested  in  expanding  more  general  functions  in  terms  of  the  orthogonal  eigenfunctions, 
the  simplest  case  being  the  classical  Fourier  series.  To  fix  notation,  we  will  proceed  as 
if  we  were  treating  a  one-dimensional  boundary  value  problem,  although  the  formulas 
are  equally  valid  for  higher-dimensional  problems,  e.g.,  those  governed  by  the  Helmholtz 
equation. Thus, we consider an eigenvalue problem of the form $S[v] = \lambda v$, where $S$ is a positive definite or semi-definite operator that is self-adjoint relative to a weighted $L^2$ inner product

$$\langle v, \tilde v \rangle = \int_a^b v(x)\, \tilde v(x)\, \rho(x)\, dx, \tag{9.102}$$

with $\rho(x) > 0$ on the bounded interval $a < x < b$.

Let $0 \le \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots$ be the eigenvalues, and $v_1, v_2, v_3, \ldots$, the corresponding eigenfunctions. Theorem 9.33 assures us that those corresponding to different eigenvalues are mutually orthogonal:

$$\langle v_j, v_k \rangle = 0, \qquad j \ne k. \tag{9.103}$$

Orthogonality is not automatic if $v_j$ and $v_k$ belong to the same eigenvalue, but it can be ensured by selecting an orthogonal basis of each eigenspace $V_\lambda$, if necessary by applying the Gram-Schmidt orthogonalization process, [89].

Let $f \in U$ be an arbitrary function in our inner product space. The eigenfunction series of $f$ is, by definition, its generalized Fourier series:

$$f \sim \sum_k c_k v_k, \qquad \text{where the coefficient} \qquad c_k = \frac{\langle f, v_k \rangle}{\| v_k \|^2} \tag{9.104}$$

is found by formally taking the inner product of both sides of (9.104) with the eigenfunction $v_k$ and invoking their mutual orthogonality. (Note that our earlier eigenfunction series formula (3.108) assumed orthonormality; here, it will be convenient to not necessarily impose the condition $\| v_k \| = 1$.) For example, in the case covered by Example 9.36, (9.104) becomes the usual Fourier sine series for the function $f$, whereas for Example 9.37, it represents its full periodic Fourier series. In a similar fashion, Example 9.40 leads to series in the eigenfunctions of the Laplacian operator on a bounded domain subject to appropriate homogeneous boundary conditions; explicit examples of the latter can be found in Chapters 11 and 12.
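The coefficient formula (9.104) can be tried out on the Fourier sine eigenfunctions. In the sketch below, the interval $[0, 1]$, the sample function $f(x) = x(1 - x)$, and the helper names are illustrative assumptions; for this $f$ the coefficients are $8/(k\pi)^3$ for odd $k$ and zero for even $k$.

```python
import math

def inner_product(f, g, n=2000):
    """Midpoint-rule approximation of the L2 inner product <f, g> on [0, 1] (weight 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(n))

def eigenfunction(k):
    """Dirichlet eigenfunction v_k(x) = sin(k*pi*x) on [0, 1]."""
    return lambda x: math.sin(k * math.pi * x)

def coefficient(f, k):
    """c_k = <f, v_k> / ||v_k||^2, as in (9.104); v_k is not assumed normalized."""
    v = eigenfunction(k)
    return inner_product(f, v) / inner_product(v, v)

f = lambda x: x * (1.0 - x)  # sample function (assumption)
coeffs = [coefficient(f, k) for k in range(1, 8)]

def partial_sum(x, coeffs):
    """Partial sum of the eigenfunction series: c_1 v_1(x) + ... + c_n v_n(x)."""
    return sum(c * math.sin((k + 1) * math.pi * x) for k, c in enumerate(coeffs))

x0 = 0.3
print("coefficients:", coeffs)
print("f(x0) =", f(x0), " partial sum =", partial_sum(x0, coeffs))
```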


As we learned in Section 3.5, convergence (in norm) of the series (9.104) requires completeness of the eigenfunctions. (Pointwise and uniform convergence are then implied by more restrictive hypotheses on the function and the domain, e.g., $f \in C^1$.) In the finite-dimensional context, when $S\colon \mathbb{R}^n \to \mathbb{R}^n$ is given by matrix multiplication, $S[u] = K u$, there are only finitely many eigenvectors, and so the summation (9.104) has only finitely many terms. There are, hence, no convergence considerations, and completeness is automatic. For boundary value problems in infinite-dimensional function space, the completeness of the resulting eigensolutions is a more delicate issue. In Example 9.36, the eigenvalue problem for $S = -D^2$ subject to homogeneous Dirichlet boundary conditions on a bounded interval leads to the Fourier sine eigenfunctions, which we know to be complete. On the other hand, the corresponding eigenvalue problem on the real line, treated in Example 9.38, has no eigenfunctions, and so completeness is out of the question. As we will see, the eigenfunctions associated with regular boundary value problems on bounded domains are automatically complete, whereas singular problems and problems on unbounded domains require additional analysis.

Whether or not the eigenfunctions are complete, we always have Bessel's inequality† (3.117):

$$\sum_k c_k^2\, \| v_k \|^2 \le \| f \|^2. \tag{9.105}$$

Theorem 3.43 says that the eigenfunctions are complete if and only if Bessel's inequality is an equality, which is then the Plancherel formula for the eigenfunction expansion.
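Bessel's inequality for orthogonal (not necessarily orthonormal) eigenfunctions can be observed numerically. This sketch uses the sine eigenfunctions on $[0, 1]$ and the sample function $f(x) = x$, both assumptions made for illustration. The running sums $\sum_{k \le n} c_k^2 \| v_k \|^2$ should increase with $n$ and stay below $\| f \|^2 = 1/3$, approaching it because the sine eigenfunctions are complete.

```python
import math

def inner_product(f, g, n=2000):
    """Midpoint-rule approximation of the L2 inner product on [0, 1] (weight 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(n))

f = lambda x: x  # sample function in L2[0, 1] (assumption)
norm_f_sq = inner_product(f, f)

bessel_sums = []
running = 0.0
for k in range(1, 101):
    v = lambda x, k=k: math.sin(k * math.pi * x)  # k-th sine eigenfunction
    norm_v_sq = inner_product(v, v)
    c_k = inner_product(f, v) / norm_v_sq          # coefficient for orthogonal v_k
    running += c_k ** 2 * norm_v_sq                # summand c_k^2 ||v_k||^2
    bessel_sums.append(running)

print("||f||^2 =", norm_f_sq)
for n in (1, 10, 100):
    print(n, "terms:", bessel_sums[n - 1])
```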


Green’s  Functions  and  Completeness 


We  now  combine  two  of  our  principal  themes.  Remarkably,  the  key  to  the  completeness 
of  eigenfunctions  for  boundary  value  problems  lies  in  the  eigenfunction  expansion  of  the 
Green’s  function!  Assume  that  S  is  both  self-adjoint  and  positive  definite.  Thus,  by 
Theorem  9.34,  all  its  eigenvalues  are  positive.  We  index  them  in  increasing  order: 


$$0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots, \tag{9.106}$$

where each eigenvalue is repeated according to its multiplicity.

By positive definiteness, the boundary value problem $S[u] = f$ has a unique solution.‡ Therefore, it admits a Green's function $G_\xi(x) = G(x; \xi)$, which satisfies the boundary value problem

$$S[G_\xi] = \delta_\xi, \tag{9.107}$$

with a delta function impulse on the right-hand side. For each fixed $\xi$, let us write down the eigenfunction series (9.104) for the Green's function:

$$G(x; \xi) \sim \sum_{k=1}^\infty c_k(\xi)\, v_k(x), \qquad \text{where the coefficient} \qquad c_k(\xi) = \frac{\langle G_\xi, v_k \rangle}{\| v_k \|^2} \tag{9.108}$$

depends on the impulse point $\xi$. Since $S[v_k] = \lambda_k v_k$, the coefficients can be explicitly evaluated by means of the following calculation:

$$\begin{aligned}
\lambda_k\, c_k(\xi)\, \| v_k \|^2 &= \lambda_k \langle G_\xi, v_k \rangle = \langle G_\xi, \lambda_k v_k \rangle = \langle G_\xi, S[v_k] \rangle \\
&= \langle S[G_\xi], v_k \rangle = \langle \delta_\xi, v_k \rangle = \int_a^b \delta(x - \xi)\, v_k(x)\, \rho(x)\, dx = v_k(\xi)\, \rho(\xi),
\end{aligned}$$

where $\rho(x)$ is the weight function of our inner product (9.102), and we invoked the self-adjointness of $S$. Solving for

$$c_k(\xi) = \frac{v_k(\xi)\, \rho(\xi)}{\lambda_k\, \| v_k \|^2}, \tag{9.109}$$

and then substituting back into (9.108), we deduce the explicit eigenfunction series

$$G(x; \xi) \sim \sum_{k=1}^\infty \frac{v_k(x)\, v_k(\xi)\, \rho(\xi)}{\lambda_k\, \| v_k \|^2} \tag{9.110}$$

for the Green's function. Observe that this expression is compatible with the weighted symmetry equation (9.58).

† Formula (3.117) assumed orthonormality of the functions; here we are stating the analogous result for orthogonal elements. Moreover, here, the eigenfunctions and hence the coefficients $c_k$ are all real, so we don't need absolute value signs.

‡ As usual, we are assuming existence of the solution; Proposition 9.19 guarantees uniqueness.

Example 9.46. According to Example 6.9, the Green's function for the $L^2$ self-adjoint boundary value problem

$$-u'' = f(x), \qquad u(0) = 0 = u(1),$$

is

$$G(x; \xi) = \begin{cases} x\, (1 - \xi), & x \le \xi, \\ \xi\, (1 - x), & x \ge \xi. \end{cases} \tag{9.111}$$

On the other hand, the eigenfunctions for

$$-v'' = \lambda v, \qquad v(0) = 0 = v(1),$$

are $v_k(x) = \sin k\pi x$, with corresponding eigenvalues $\lambda_k = k^2 \pi^2$, for $k = 1, 2, 3, \ldots$. Since

$$\| v_k \|^2 = \int_0^1 \sin^2 k\pi x\, dx = \frac{1}{2},$$

formula (9.110) implies the eigenfunction expansion

$$G(x; \xi) \sim \sum_{k=1}^\infty \frac{2 \sin k\pi x\, \sin k\pi \xi}{k^2 \pi^2}. \tag{9.112}$$

This result can be checked by a direct computation of the Fourier sine series of (9.111).
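The agreement between the closed form (9.111) and the series (9.112) can also be checked numerically. The impulse point $\xi = 0.3$, the evaluation grid, and the truncation levels in this sketch are illustrative assumptions; the maximum discrepancy should shrink roughly like $1/N$ as the number $N$ of terms grows.

```python
import math

def green_closed(x, xi):
    """Closed-form Green's function (9.111) for -u'' = f, u(0) = u(1) = 0."""
    return x * (1 - xi) if x <= xi else xi * (1 - x)

def green_series(x, xi, terms):
    """Partial sum of the eigenfunction series (9.112)."""
    return sum(2 * math.sin(k * math.pi * x) * math.sin(k * math.pi * xi) / (k * math.pi) ** 2
               for k in range(1, terms + 1))

xi = 0.3  # illustrative impulse point (assumption)
grid = [j / 100 for j in range(101)]

errors = {}
for terms in (5, 50, 500):
    errors[terms] = max(abs(green_series(x, xi, terms) - green_closed(x, xi)) for x in grid)
    print(terms, "terms: max error on grid =", errors[terms])
```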

Let us now apply Bessel's inequality (9.105) to the eigenfunction series (9.108) for the Green's function; using (9.109), the result is

$$\sum_{k=1}^n c_k(\xi)^2\, \| v_k \|^2 = \sum_{k=1}^n \frac{v_k(\xi)^2\, \rho(\xi)^2}{\lambda_k^2\, \| v_k \|^2} \le \| G_\xi \|^2 = \int_a^b G(x; \xi)^2\, \rho(x)\, dx. \tag{9.113}$$

We divide by $\rho(\xi) > 0$, and then integrate both sides of the resulting inequality from $a$ to $b$. On the left-hand side, the integrated summands are

$$\int_a^b \frac{v_k(\xi)^2\, \rho(\xi)}{\lambda_k^2\, \| v_k \|^2}\, d\xi = \frac{1}{\lambda_k^2\, \| v_k \|^2} \int_a^b v_k(\xi)^2\, \rho(\xi)\, d\xi = \frac{1}{\lambda_k^2}.$$

Substituting back into (9.113) establishes the interesting inequality

$$\sum_{k=1}^n \frac{1}{\lambda_k^2} \le \int_a^b \!\! \int_a^b G(x; \xi)^2\, \frac{\rho(x)}{\rho(\xi)}\, dx\, d\xi. \tag{9.114}$$

To make the right-hand side look less strange, we can replace $G(x; \xi)$ by the symmetric modified Green's function $\widehat{G}(x; \xi) = G(x; \xi)/\rho(\xi) = \widehat{G}(\xi; x)$, cf. (9.59), whence

$$\int_a^b \!\! \int_a^b G(x; \xi)^2\, \frac{\rho(x)}{\rho(\xi)}\, dx\, d\xi = \int_a^b \!\! \int_a^b \widehat{G}(x; \xi)^2\, \rho(x)\, \rho(\xi)\, dx\, d\xi = \| \widehat{G} \|^2, \tag{9.115}$$

which we can interpret as a "double weighted $L^2$ norm" of the modified Green's function $\widehat{G}(x; \xi)$. Since the summands in (9.114) are all positive, we can let $n \to \infty$, and conclude that

$$\sum_{k=1}^\infty \frac{1}{\lambda_k^2} \le \| \widehat{G} \|^2. \tag{9.116}$$

Thus, assuming that the right-hand side of this inequality is finite, the summation on the left converges. This implies that its summands must go to zero: $1/\lambda_k^2 \to 0$ as $k \to \infty$. We have thus proved the first statement of the following important result.


Theorem 9.47. If $\| \widehat{G} \|^2 < \infty$, then the eigenvalues of the positive definite self-adjoint operator $S$ are unbounded: $0 < \lambda_k \to \infty$ as $k \to \infty$. Moreover, the associated orthogonal eigenfunctions $v_1, v_2, v_3, \ldots$, are complete.


Proof: Our remaining task is to prove completeness, that is, that the eigenfunction series (9.104) of any function $f \in U$ converges in norm. For $n = 2, 3, 4, \ldots$, consider the function

$$g_{n-1} = f - \sum_{k=1}^{n-1} c_k v_k,$$

i.e., the difference between the function $f$ and the $(n-1)$st partial sum of its eigenfunction series. Completeness requires that

$$\| g_{n-1} \| \to 0 \qquad \text{as} \qquad n \to \infty. \tag{9.117}$$

We can assume that $g_{n-1} \ne 0$, since otherwise, the eigenfunction series terminates, with $0 = g_{n-1} = g_n = g_{n+1} = \cdots$ (why?), and so (9.117) holds trivially.

First, note that, for any $j = 1, \ldots, n - 1$,

$$\langle g_{n-1}, v_j \rangle = \langle f, v_j \rangle - \sum_{k=1}^{n-1} c_k \langle v_k, v_j \rangle = \langle f, v_j \rangle - c_j \| v_j \|^2 = 0,$$

by the orthogonality of the eigenfunctions combined with the formula (9.104) for the coefficient $c_j$. Thus, $g_{n-1} \in U_{n-1}$, the subspace (9.99) of functions orthogonal to the first $n - 1$ eigenfunctions used in the Rayleigh Minimization Theorem 9.43. Since, according to (9.100), $\lambda_n$ is the minimum value of the Rayleigh quotient among all nonzero elements of $U_{n-1}$, we must have

$$\lambda_n \le R[g_{n-1}] = \frac{\langle g_{n-1}, S[g_{n-1}] \rangle}{\| g_{n-1} \|^2},$$
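and hence

$$\begin{aligned}
\lambda_n\, \| g_{n-1} \|^2 &\le \langle g_{n-1}, S[g_{n-1}] \rangle = \Big\langle f - \sum_{k=1}^{n-1} c_k v_k \,,\; S\Big[ f - \sum_{k=1}^{n-1} c_k v_k \Big] \Big\rangle \\
&= \Big\langle f - \sum_{k=1}^{n-1} c_k v_k \,,\; S[f] - \sum_{k=1}^{n-1} c_k S[v_k] \Big\rangle = \Big\langle f - \sum_{k=1}^{n-1} c_k v_k \,,\; S[f] - \sum_{k=1}^{n-1} c_k \lambda_k v_k \Big\rangle \\
&= \langle f, S[f] \rangle - \sum_{k=1}^{n-1} c_k \lambda_k \langle f, v_k \rangle - \sum_{k=1}^{n-1} c_k \langle v_k, S[f] \rangle + \sum_{k=1}^{n-1} \lambda_k c_k^2\, \| v_k \|^2 \\
&= \langle f, S[f] \rangle - \sum_{k=1}^{n-1} \lambda_k c_k^2\, \| v_k \|^2.
\end{aligned}$$

In the final equality, we used the self-adjointness of $S$ to identify

$$\langle v_k, S[f] \rangle = \langle S[v_k], f \rangle = \lambda_k \langle v_k, f \rangle = \lambda_k \langle f, v_k \rangle,$$

coupled with the formula in (9.104) for the coefficients $c_k$. Since the summands in the final expression are all positive, we conclude that

$$\| g_{n-1} \|^2 \le \frac{\langle f, S[f] \rangle}{\lambda_n}.$$

Since we already know that $\lambda_n \to \infty$, the right-hand side of the final inequality goes to 0 as $n \to \infty$. This implies (9.117) and hence establishes completeness. Q.E.D.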

One important corollary of this theorem is that, since each eigenvalue is repeated according to its geometric multiplicity, the multiplicity cannot be infinite (why?), and hence each eigenspace of such an $S$ is necessarily finite-dimensional.


Example 9.48. For the eigenvalue problem considered in Example 9.46, since $\rho(x) \equiv 1$, the double norm of the (modified) Green's function $\widehat{G}(x; \xi) = G(x; \xi)$ is

$$\| G \|^2 = \int_0^1 \!\! \int_0^1 G(x; \xi)^2\, dx\, d\xi = 2 \int_0^1 \!\! \int_0^\xi x^2 (1 - \xi)^2\, dx\, d\xi = \frac{1}{90} < \infty.$$

Thus, Theorem 9.47 re-establishes the completeness of the sine eigenfunctions, meaning that the eigenfunction series, which is just the ordinary Fourier sine series on $[0, 1]$, converges in norm.
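Both sides of inequality (9.116) can be evaluated for this example. The sketch below, whose grid size and truncation levels are illustrative assumptions, approximates the double norm by a midpoint rule and compares it with partial sums of $\sum 1/\lambda_k^2 = \sum 1/(k\pi)^4$. Since $\sum_{k \ge 1} k^{-4} = \pi^4/90$, the partial sums should approach $1/90$ from below, so the inequality becomes an equality in the limit.

```python
import math

def green(x, xi):
    """Green's function (9.111) for -u'' = f, u(0) = u(1) = 0."""
    return x * (1 - xi) if x <= xi else xi * (1 - x)

# Midpoint-rule approximation of the double L2 norm of G on the unit square.
n = 400
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]
double_norm_sq = h * h * sum(green(x, xi) ** 2 for x in pts for xi in pts)

# Partial sums of 1/lambda_k^2 with lambda_k = (k*pi)^2.
partial_sums = {m: sum(1 / (k * math.pi) ** 4 for k in range(1, m + 1)) for m in (1, 5, 50)}

print("double norm squared:", double_norm_sq, " exact value 1/90 =", 1 / 90)
for m, s in partial_sums.items():
    print(m, "terms:", s)
```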


Indeed, for any regular Sturm-Liouville boundary value problem on a bounded interval, the (modified) Green's function is automatically continuous, and hence its double weighted norm is finite. Thus, Theorem 9.47 implies the completeness of the Sturm-Liouville eigenfunctions. In Chapters 11 and 12, we will extend this result to some important singular boundary value problems.
Example 9.49. The completeness result of Theorem 9.47 doesn't directly apply to the periodic boundary value problem of Example 9.37, because it is not positive definite, and hence there is no Green's function. However, we can convert it into a positive definite problem by a simple trick. As you are asked to prove in Exercise 9.4.4, if $S \ge 0$ is any positive semi-definite operator and $\mu > 0$ any positive constant, then $\widetilde{S} = S + \mu\, I$ is positive definite, where $I[u] = u$ is the identity operator. Thus, we replace the original periodic boundary value problem (9.88) by the following modification:

$$-v'' + \mu v = \lambda v, \qquad v(-\pi) = v(\pi), \qquad v'(-\pi) = v'(\pi). \tag{9.118}$$

This does not alter the eigenfunctions, while adding $\mu$ to each of the eigenvalues, and hence the modified problem has eigenvalues $\lambda_0 = \mu$, with eigenfunction $v_0(x) = 1$, and $\lambda_n = n^2 + \mu$, with two independent eigenfunctions: $v_n(x) = \cos nx$ and $\tilde v_n(x) = \sin nx$.

The Green's function for the periodic boundary value problem

$$-v'' + \mu v = \delta(x - \xi), \qquad v(-\pi) = v(\pi), \qquad v'(-\pi) = v'(\pi),$$

where $\mu > 0$ is a fixed constant, is derived along the same lines as in Example 6.10. Setting $\mu = \omega^2$, the result is

$$G(x; \xi) = \frac{\cosh \omega\, (\pi - | x - \xi |)}{2\, \omega \sinh \pi\omega}. \tag{9.119}$$

Its double $L^2$ norm is clearly finite, and, although unnecessary, can even be computed:

$$\| G \|^2 = \int_{-\pi}^{\pi} \!\! \int_{-\pi}^{\pi} G(x; \xi)^2\, dx\, d\xi = \frac{\pi\, (2\pi\omega + \sinh 2\pi\omega)}{4\, \omega^3 \sinh^2 \pi\omega} < \infty.$$

As a result, Theorem 9.47 reconfirms the completeness of the trigonometric eigenfunctions.
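The closed-form double norm can be cross-checked in two independent ways: by direct quadrature of $G(x;\xi)^2$ from (9.119), and against the sum of $1/\lambda^2$ over the eigenvalues $\lambda_0 = \mu$ and $\lambda_n = n^2 + \mu$ (each of the latter counted twice), which completeness forces to equal the double norm. The value $\omega = 1$, the grid size, and the truncation of the sum in this sketch are illustrative assumptions.

```python
import math

omega = 1.0        # sample value (assumption), so that mu = omega**2
mu = omega ** 2

def green(x, xi):
    """Periodic Green's function (9.119) for -v'' + mu v on [-pi, pi]."""
    return (math.cosh(omega * (math.pi - abs(x - xi)))
            / (2 * omega * math.sinh(math.pi * omega)))

# Closed-form double norm quoted in the example.
closed_form = (math.pi * (2 * math.pi * omega + math.sinh(2 * math.pi * omega))
               / (4 * omega ** 3 * math.sinh(math.pi * omega) ** 2))

# Midpoint-rule approximation of the double integral of G^2 over [-pi, pi]^2.
n = 300
h = 2 * math.pi / n
pts = [-math.pi + (i + 0.5) * h for i in range(n)]
quadrature = h * h * sum(green(x, xi) ** 2 for x in pts for xi in pts)

# Sum of 1/lambda^2: lambda_0 = mu once, lambda_k = k^2 + mu twice (cos and sin).
eig_sum = 1 / mu ** 2 + 2 * sum(1 / (k * k + mu) ** 2 for k in range(1, 2001))

print("closed form:", closed_form)
print("quadrature: ", quadrature)
print("eigenvalue sum:", eig_sum)
```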


Example 9.50. According to (6.120), the Green's function $G(\mathbf{x}; \boldsymbol{\xi})$ for the Dirichlet boundary value problem for the Poisson equation on a domain $\Omega \subset \mathbb{R}^2$ is the sum of a logarithmic potential (6.106) and a harmonic function. Thus $G(\mathbf{x}; \boldsymbol{\xi})^2$ is a sum of three terms: the first two, involving $(\log r)^2$ and $\log r$ with $r = \| \mathbf{x} - \boldsymbol{\xi} \|$, have mild singularities when $\mathbf{x} = \boldsymbol{\xi}$, while the last term is smooth (indeed analytic) everywhere. Using this information, it is not hard to prove that its double $L^2$ norm

$$\| G \|^2 = \iint_\Omega \left( \iint_\Omega G(x, y; \xi, \eta)^2\, dx\, dy \right) d\xi\, d\eta < \infty$$

is finite. Indeed, the only problematic point is the logarithmic singularity at $\mathbf{x} = \boldsymbol{\xi}$, but a polar coordinate computation, similar to that used in the proof of Theorem 6.17, shows that such logarithmic singularities still have finite integrals. Therefore, Theorem 9.47 implies that the Helmholtz eigenvalues $\lambda_n \to \infty$, and the corresponding Helmholtz eigenfunctions $v_n(x, y)$ form a complete orthogonal system.


Remark: In problems involving unbounded domains, such as the Schrödinger equation
for  the  hydrogen  atom  to  be  discussed  in  Section  12.7,  the  eigenfunctions  are  typically  not 
complete,  and  one  needs  to  introduce  additional  solutions  corresponding  to  what  is  known 
as  the  continuous  spectrum  of  the  operator.  Functions  are  now  represented  by  combinations 
of  discrete  Fourier-like  sums  over  the  eigenfunctions  (the  bound  states  in  the  quantum- 
mechanical  system)  plus  a  Fourier  integral-like  term  involving  the  continuous  spectrum 
(the  scattering  states),  [66,  72].  A  full  discussion  of  completeness  and  convergence  in  such 
cases  must  be  deferred  to  an  advanced  course  in  analysis,  [95]. 
Exercises


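9.4.1. Find the eigenvalues and an orthonormal eigenvector basis for the following symmetric matrices:

(a) $\begin{pmatrix} 2 & 6 \\ 6 & -7 \end{pmatrix}$, (b) $\begin{pmatrix} 5 & -2 \\ -2 & 5 \end{pmatrix}$, (c) $\begin{pmatrix} 2 & -1 \\ -1 & 5 \end{pmatrix}$, (d) $\begin{pmatrix} 1 & 0 & 4 \\ 0 & 1 & 3 \\ 4 & 3 & 1 \end{pmatrix}$, (e) $\begin{pmatrix} 6 & -4 & 1 \\ -4 & 6 & -1 \\ 1 & -1 & 11 \end{pmatrix}$.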

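9.4.2. Determine whether the following symmetric matrices are positive definite by computing their eigenvalues.

(a) $\begin{pmatrix} 2 & -2 \\ -2 & 3 \end{pmatrix}$, (b) $\begin{pmatrix} 2 & 3 \\ 3 & 6 \end{pmatrix}$, (c) $\begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix}$, (d) $\begin{pmatrix} 4 & -1 & -2 \\ -1 & 4 & -1 \\ -2 & -1 & 4 \end{pmatrix}$.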

9.4.3. Suppose $S[u] = K u$, where $K = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$. (a) Show that $S\colon \mathbb{R}^2 \to \mathbb{R}^2$ is positive semi-definite under the dot product. (b) Find the eigenvalues of $S$. (c) Explain why your result in part (b) does not contradict Theorem 9.34.

0 9.4.4. Suppose that $S\colon U \to U$ is a positive semi-definite linear operator. Let $I\colon U \to U$ be the identity operator, so $I[u] = u$. (a) Prove that, for any positive scalar $\mu > 0$, the operator $\widetilde{S} = S + \mu\, I$ is positive definite. (b) Show that $S$ and $\widetilde{S}$ have the same eigenfunctions. Do they have the same eigenvalues? If not, how are their eigenvalues related?


9.4.5. Find the minimum value of $R[v] = \dfrac{\int_0^1 v'^2\, dx}{\int_0^1 v^2\, dx}$ on the space of $C^2$ functions $v(x)$ defined on $0 \le x \le 1$ that are subject to one of the following pairs of boundary conditions:

(a) $v(0) = v(1) = 0$, (b) $v(0) = v'(1) = 0$, (c) $v'(0) = v'(1) = 0$.


9.4.6. Find the minimum value of $R[v] = \dfrac{\int_1^e x^2\, v'^2\, dx}{\int_1^e v^2\, dx}$ on the space of $C^2$ functions defined on $[1, e]$ subject to the boundary conditions $v(1) = v(e) = 0$.

9.4.7. Show that the Rayleigh quotient $R[v]$ has the same value for all nonzero scalar multiples of an element $0 \ne v \in U$, i.e., $R[c\, v] = R[v]$ for all $c \ne 0$.

9.4.8.  Prove  that  the  minimum  value  of  the  Rayleigh  quotient  of  a  positive  semi-definite,  but 
not  positive  definite,  operator  is  0. 


T 9.4.9. (a) Find the eigenfunctions and eigenvalues for the boundary value problem

$-x^2 u'' - x\, u' = \lambda u, \qquad u(1) = u(e) = 0.$

(b) Under which inner product are the eigenfunctions orthogonal? Justify your answer by direct computation.

(c) Write down the eigenfunction expansion of a function $f(x)$ defined for $1 \le x \le e$.

(d) Find the Green's function for

$-x^2 u'' - x\, u' = f(x), \qquad u(1) = u(e) = 0,$

both in closed form and as a series in the eigenfunctions you found in part (a).

(e)  Is  your  Green’s  function  symmetric?  Discuss. 

(f)  Prove  the  completeness  of  the  eigenfunctions. 

9.4.10. Discuss completeness of the eigenfunctions of the boundary value problem

$-x^2 u'' - 2x\, u' = \lambda u, \qquad | u(0) | < \infty, \qquad u(1) = 0.$

9.4.11. Consider the eigenvalue problem $-u'' = \lambda u$, $u(0) = 0$, $u'(1) = 0$. (a) Is the problem self-adjoint? positive definite? Which inner product are you referring to? (b) Find all eigenvalues and eigenfunctions. (c) Write down the explicit formula for the eigenfunction expansion of a function $f(x)$ defined on $[0, 1]$. (d) Find the Green's function and use it to prove completeness of the eigenfunctions.

C  9.4.12.  (a)  Find  the  eigenfunctions  and  eigenvalues  for  the  Chebyshev  boundary  value  problem 

$$(x^2 - 1)\, u'' + x\, u' = \lambda\, u, \qquad u(-1) = u(1) = 0.$$

Hint: Let $x = \cos\theta$. (b) Under what inner product are the eigenfunctions orthogonal? Justify your answer by direct computation. (c) Find the Green's function for

$$(x^2 - 1)\, u'' + x\, u' = f(x), \qquad u(-1) = u(1) = 0,$$

both  in  closed  form  and  as  a  series  in  the  eigenfunctions  you  found  in  part  (a). 

(d)  Discuss  completeness  of  the  eigenfunctions. 

9.4.13. Consider the differential operator $S[u] = -u'' + u$ on the space of $C^2$ functions $u(x)$ defined for all $x$ and subject to the boundary conditions $\lim_{x \to \infty} u(x) = \lim_{x \to -\infty} u(x) = 0$.

(a) Find the Green's function $G(x; \xi)$. (b) Compute its double $L^2$ norm $\|G\|^2$. What does this indicate about the completeness of the eigenfunctions of $S$? (c) Justify your conclusion in part (b) by determining the eigenfunctions.

9.4.14. Find all (real and complex) eigenvalues of the first-derivative operator $D = d/dx$ on the interval $[0, 1]$ subject to the single periodic boundary condition $u(0) = u(1)$. Are the corresponding eigenfunctions orthogonal? For which inner product?


C  9.4.15.  Consider  the  Dirichlet  boundary  value  problem 

$$-\Delta u = h(x, y), \qquad u(x, 0) = 0, \quad u(x, 1) = 0, \quad u(0, y) = 0, \quad u(1, y) = 0, \qquad 0 < x, y < 1,$$

for the Poisson equation on the unit square. (a) Find the eigenfunction series expansion for the Green's function of this problem. (b) Does your series coincide with that derived in Exercise 6.3.22? Explain any discrepancies. (c) For the impulse points $(\xi, \eta) = (.5, .5)$ and $(.7, .8)$, graph the result of summing the first 9, 25, and 100 terms in your series, and discuss what you observe in light of what you expect the Green's function to look like.


9.4.16.  Find  the  eigenfunction  series  expansion  for  the  Green’s  function  of  the  following  mixed 
boundary  value  problems: 

(a) $-\Delta u = h(x, y)$, $u(x, 0) = 0$, $u(x, 1) = 0$, $u_x(0, y) = 0$, $u_x(1, y) = 0$, $0 < x, y < 1$;

(b) $-\Delta u = h(x, y)$, $u(x, 0) = 0$, $u_y(x, 1) = 0$, $u(0, y) = 0$, $u_x(1, y) = 0$, $0 < x, y < 1$.

9.4.17.  Find  the  eigenfunction  series  expansion  for  the  Green’s  function  of  the  following 
Helmholtz  boundary  value  problem: 

$$-\Delta u + u = h(x, y), \qquad u(x, 0) = u(x, \pi) = u(0, y) = u(\pi, y) = 0, \qquad 0 < x, y < \pi.$$


9.4.18. If the eigenvalues of a self-adjoint linear operator satisfy $\lambda_n \to \infty$ as $n \to \infty$, explain why each eigenspace is necessarily finite-dimensional.

9.4.19. True or false: If $S \colon \mathbb{R}^n \to \mathbb{R}^n$ is any linear function, then one can find an inner product on $\mathbb{R}^n$ that makes $S$ self-adjoint.


9.5  A  General  Framework  for  Dynamics 

In this final section, we show how to use general eigenfunction expansions to analyze three important classes of linear dynamical systems: parabolic diffusion equations such as the heat equation, hyperbolic vibration equations such as the wave equation, and the Schrödinger equation, a complex evolution equation that governs the dynamical processes of quantum mechanics. In all three cases we can, assuming completeness, write the general solution to the initial-boundary value problem as a convergent eigenfunction series with time-dependent coefficients, and thereby establish several general properties governing their dynamics.


Evolution  Equations 


In all cases, our starting point is the basic equilibrium equation, which is a linear system of the form

$$S[u] = f, \tag{9.120}$$

where $f$ represents an external forcing. The linear operator $S$ is assumed to be of the usual self-adjoint form

$$S = L^* \circ L, \tag{9.121}$$

which is either positive definite, when $\ker L = \{0\}$, or positive semi-definite, the latter case being characterized by the existence of null eigenfunctions $0 \ne v \in \ker L = \ker S$. In finite dimensions, (9.120) represents a linear algebraic system consisting of $n$ equations in $n$ unknowns with positive (semi-)definite coefficient matrix. In infinite-dimensional function space, it represents a self-adjoint positive (semi-)definite boundary value problem for the unknown function $u$.

With  the  equilibrium  operator  in  hand,  there  are  two  principal  classical  dynamical 
systems  of  importance  as  physical  models.  The  first  are  the  (unforced)  diffusion  processes 
modeled  by  an  evolution  equation  of  the  form 


$$\frac{\partial u}{\partial t} = -S[u] = -L^* \circ L[u]. \tag{9.122}$$


In  the  discrete  case,  this  represents  a  first-order  system  of  ordinary  differential  equations, 
known  as  a  linear  gradient  flow .  In  the  continuous  case,  S  is  a  linear  differential  operator 
equipped  with  homogeneous  boundary  conditions,  and  (9.122)  represents  a  linear  partial 
differential equation for the time-varying function $u = u(t, x)$, the heat equation being
the  prototypical  example.  (As  in  the  preceding  section,  the  notation  employed  below 
indicates  that  we  are  working  in  a  single  space  dimension,  but  the  methods  and  results 
apply  equally  well  to  higher-dimensional  problems.)  The  addition  of  external  forcing  to 
the  diffusion  process  is  treated  in  Exercise  9.5.6. 
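In the discrete case the recipe can be checked by hand. The following is a minimal sketch, not taken from the text: the $2 \times 2$ matrix `K`, its eigenpairs, and the function names are illustrative assumptions. It expands the initial vector in the orthogonal eigenvectors of a symmetric positive definite matrix, attaches the decaying exponentials, and confirms by a finite-difference test that the result satisfies $du/dt = -K u$.

```python
import math

# Hypothetical example matrix: symmetric positive definite, playing the role of S.
K = [[2.0, -1.0], [-1.0, 2.0]]
# Its eigenpairs (lambda_k, v_k), found by hand: K(1,1) = 1*(1,1), K(1,-1) = 3*(1,-1).
EIGENPAIRS = [(1.0, (1.0, 1.0)), (3.0, (1.0, -1.0))]


def dot(a, b):
    """Euclidean dot product."""
    return sum(x * y for x, y in zip(a, b))


def gradient_flow(u0, t):
    """Solve du/dt = -K u, u(0) = u0, by expanding u0 in the eigenvectors of K."""
    u = [0.0, 0.0]
    for lam, v in EIGENPAIRS:
        c = dot(u0, v) / dot(v, v)      # coefficient c_k = <u0, v_k> / ||v_k||^2
        decay = math.exp(-lam * t)      # each mode decays at its own rate lambda_k
        u = [ui + c * decay * vi for ui, vi in zip(u, v)]
    return u


def residual(u0, t, h=1e-5):
    """Largest component of du/dt + K u, with du/dt from a central difference."""
    up, um, u = gradient_flow(u0, t + h), gradient_flow(u0, t - h), gradient_flow(u0, t)
    dudt = [(a - b) / (2 * h) for a, b in zip(up, um)]
    Ku = [dot(row, u) for row in K]
    return max(abs(d + k) for d, k in zip(dudt, Ku))


u0 = [3.0, 1.0]
print(gradient_flow(u0, 0.0))   # reproduces the initial data
print(residual(u0, 0.7))        # small: the system is satisfied up to discretization error
```

For large $t$ only the slowest mode survives, so the solution decays at the rate set by the smallest eigenvalue, here $\lambda_1 = 1$.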

The  basic  separation  of  variables  solution  technique  was  already  outlined  in  Section  3.1. 
To  recap,  the  separable  solutions  are  of  exponential  form 


$$u(t, x) = e^{-\lambda t}\, v(x), \tag{9.123}$$


where $v \in U$ is a fixed function. Since the operator $S$ is linear and does not involve $t$ differentiation, we find

$$\frac{\partial u}{\partial t} = -\lambda\, e^{-\lambda t}\, v, \qquad \text{while} \qquad S[u] = e^{-\lambda t}\, S[v].$$


Substituting  back  into  (9.122)  and  canceling  the  common  exponential  factors,  we  are  led 
to  the  eigenvalue  problem 


$$S[v] = \lambda\, v. \tag{9.124}$$

Thus, (9.123) defines a solution if and only if $v$ is an eigenfunction for the linear operator $S$, with $\lambda$ the corresponding eigenvalue.

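We let $v_k(x)$, $k = 1, 2, \ldots$, be the orthogonal eigenfunctions and $0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots \to \infty$ the corresponding eigenvalues. Assuming completeness, the solution to the initial value problem

$$u(0, x) = f(x) \tag{9.125}$$

can be expanded in terms of the eigensolutions:

$$u(t, x) = \sum_{k=1}^{\infty} c_k\, e^{-\lambda_k t}\, v_k(x), \qquad \text{where} \qquad c_k = \frac{\langle f, v_k \rangle}{\|v_k\|^2} \tag{9.126}$$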


are the eigenfunction coefficients of the initial data. In particular, the fundamental solution of the diffusion equation is defined as the solution $u = F(t, x; \xi)$ to the initial value problem

$$u(0, x) = \delta_\xi(x) \tag{9.127}$$

induced by an initial delta impulse at the point $\xi$. Its eigenfunction coefficients are

$$c_k = \frac{\langle \delta_\xi, v_k \rangle}{\|v_k\|^2} = \frac{1}{\|v_k\|^2} \int_a^b \delta(x - \xi)\, v_k(x)\, \rho(x)\, dx = \frac{v_k(\xi)\, \rho(\xi)}{\|v_k\|^2}.$$


Thus,

$$F(t, x; \xi) = \sum_{k=1}^{\infty} e^{-\lambda_k t}\, \frac{v_k(\xi)\, v_k(x)\, \rho(\xi)}{\|v_k\|^2}, \tag{9.128}$$

where the denominator denotes the appropriately weighted $L^2$ norm of the eigenfunction:

$$\|v_k\|^2 = \int_a^b v_k(x)^2\, \rho(x)\, dx.$$


As with the one-dimensional heat equation, if the equilibrium operator is positive definite, $S > 0$, then all the eigenvalues are strictly positive, and hence, generically, solutions decay to 0 at the exponential rate prescribed by the smallest eigenvalue, which can be characterized as the minimum value of the Rayleigh quotient. On the other hand, if $S$ is only positive semi-definite, then the solution will tend to a null eigenmode, that is, an element of $\ker S = \ker L$, as its asymptotic equilibrium state. If $\dim \ker S = p$, the first $p$ eigenvalues are all zero, $0 = \lambda_1 = \cdots = \lambda_p < \lambda_{p+1}$, and the solution

$$u(t, x) \longrightarrow \sum_{k=1}^{p} c_k\, v_k(x) \qquad \text{as} \quad t \to \infty$$

will tend to its eventual equilibrium configuration at an exponential rate determined by the smallest positive eigenvalue $\lambda_{p+1} > 0$. In almost all applications, $p = 1$ and there is a
single,  constant  null  eigenfunction.  The  Neumann  and  periodic  boundary  value  problems 
for  the  heat  equation  are  prototypical  examples. 
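As a concrete check of the series (9.128), the sketch below specializes to an assumed model problem: the heat equation $u_t = \gamma\, u_{xx}$ on $0 < x < 1$ with Dirichlet conditions, for which $v_k(x) = \sin k \pi x$, $\lambda_k = \gamma\, k^2 \pi^2$, $\rho \equiv 1$, and $\|v_k\|^2 = \frac12$. The function names and the truncation level are illustrative choices. For small $t$ and points away from the boundary, the partial sum should be nearly indistinguishable from the free-space heat kernel, while for large $t$ it should decay like the first mode.

```python
import math


def fundamental_solution(t, x, xi, gamma=1.0, terms=400):
    """Partial sum of (9.128) for u_t = gamma u_xx on [0, 1] with Dirichlet conditions."""
    total = 0.0
    for k in range(1, terms + 1):
        lam = gamma * (k * math.pi) ** 2    # eigenvalue lambda_k
        norm_sq = 0.5                       # ||sin(k pi x)||^2 on [0, 1], weight rho = 1
        total += (math.exp(-lam * t)
                  * math.sin(k * math.pi * xi) * math.sin(k * math.pi * x) / norm_sq)
    return total


def free_space_kernel(t, x, xi, gamma=1.0):
    """Heat kernel on the whole line, used only as a small-time comparison."""
    return math.exp(-(x - xi) ** 2 / (4 * gamma * t)) / math.sqrt(4 * math.pi * gamma * t)


# Small time, away from the boundary: the two kernels nearly coincide.
print(fundamental_solution(0.001, 0.5, 0.45), free_space_kernel(0.001, 0.5, 0.45))

# Large time: the k = 1 term 2 exp(-pi^2 t) sin(pi xi) sin(pi x) dominates, so the ratio is close to 1.
print(fundamental_solution(0.5, 0.5, 0.5) / (2 * math.exp(-math.pi ** 2 * 0.5)))
```

The second printed ratio illustrates the statement above: in the positive definite case the decay rate is governed by the smallest eigenvalue, here $\lambda_1 = \gamma\, \pi^2$.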
Exercises


9.5.1. Find the eigenfunction series of the fundamental solution for the heat equation $u_t = \gamma\, u_{xx}$ on the interval $0 < x < 1$ subject to homogeneous Dirichlet boundary conditions.

9.5.2. Solve Exercise 9.5.1 for (a) the mixed boundary conditions $u(t, 0) = u_x(t, 1) = 0$; (b) homogeneous Neumann boundary conditions.


9.5.3. Let $D[u] = u'$ be the derivative operator acting on the vector space of $C^1$ scalar functions $u(x)$ defined for $0 < x < 1$ and satisfying the boundary conditions $u(0) = u(1) = 0$.
(a) Given the $L^2$ inner product on its domain space and the weighted inner product $\langle\langle v, \tilde v \rangle\rangle = \int_0^1 v(x)\, \tilde v(x)\, x\, dx$ on its target space, determine the adjoint operator $D^*$.

(b) Let $S = D^* \circ D$. Write out the diffusion equation $u_t = -S[u]$ explicitly, as a partial differential equation plus boundary conditions.

(c) Given the initial condition $u(0, x) = x - x^2$, what is the asymptotic equilibrium $u_\star(x) = \lim_{t \to \infty} u(t, x)$ of the resulting solution to the diffusion equation?


9.5.4. Write down an eigenfunction series for the solution $u(t, x)$ to the initial value problem $u(0, x) = f(x)$ for the fourth-order evolution equation $u_t = -u_{xxxx}$ subject to the boundary conditions $u(t, 0) = u_{xx}(t, 0) = u(t, 1) = u_{xx}(t, 1) = 0$. Does your solution tend to an equilibrium state? If so, at what rate?


9.5.5. Answer Exercise 9.5.4 for the boundary conditions $u_x(t, 0) = u_{xxx}(t, 0) = u_x(t, 1) = u_{xxx}(t, 1) = 0$.

^ 9.5.6. Explain how to solve the forced diffusion equation $u_t = -S[u] + f$, subject to homogeneous boundary conditions, when $f(x)$ does not depend on time $t$. Does the solution tend to equilibrium as $t \to \infty$? If so, what is the rate of decay, and what is the equilibrium?


9.5.7. Show that if $u(t, x)$ solves the diffusion equation (9.122), then $\|u(t, \cdot)\| \ge \|u(s, \cdot)\|$ whenever $t < s$.


^ 9.5.8. Let $S > 0$ be a positive definite operator. Suppose $F(t, x; \xi)$ is the fundamental solution for the diffusion equation (9.122). Prove that $G(x; \xi) = \int_0^\infty F(t, x; \xi)\, dt$ is the Green's function for the corresponding equilibrium equation $S[u] = f$.


Vibration  Equations 


The second important class of dynamical systems comprises the second-order (in time) vibration equations

$$\frac{\partial^2 u}{\partial t^2} = -S[u], \tag{9.129}$$


which  we  initially  analyze  in  the  absence  of  external  forcing.  Vibrational  systems  arise  as 
a  consequence  of  Newton’s  equations  of  motion  in  the  absence  of  frictional  forces.  Their 
continuum  versions  model  the  propagation  of  waves  in  solids  and  fluids,  electromagnetic 
waves,  plasma  waves,  and  many  other  related  physical  systems. 

For  a  general  vibration  equation,  the  separable  solutions  are  of  trigonometric  form 


$$u(t, x) = \cos(\omega t)\, v(x) \qquad \text{or} \qquad \sin(\omega t)\, v(x). \tag{9.130}$$

Substituting either ansatz back into (9.129) results in the same eigenvalue problem (9.124) for $v(x)$ with eigenvalue $\lambda = \omega^2$ equal to the square of the vibrational frequency. We conclude that the normal modes or eigensolutions take the form

$$u_k(t, x) = \cos(\omega_k t)\, v_k(x), \qquad \tilde u_k(t, x) = \sin(\omega_k t)\, v_k(x),$$

provided $\lambda_k = \omega_k^2 > 0$ is a nonzero eigenvalue and $v_k$ an associated eigenfunction. Thus, the
natural  vibrational  frequencies  of  the  system  are  the  square  roots  of  the  nonzero  eigenvalues, 
a  fact  that  we  already  observed  in  the  context  of  the  one-dimensional  wave  equation. 

In  the  positive  definite  case,  the  eigenvalues  are  all  strictly  positive,  and  so  the  general 
solution  is  built  up  as  a  linear  combination  of  vibrational  eigenmodes: 


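$$u(t, x) = \sum_{k=1}^{\infty} \bigl[\, c_k\, u_k(t, x) + d_k\, \tilde u_k(t, x) \,\bigr] = \sum_{k=1}^{\infty} \bigl[\, c_k \cos(\omega_k t) + d_k \sin(\omega_k t) \,\bigr]\, v_k(x) = \sum_{k=1}^{\infty} r_k \cos(\omega_k t - \delta_k)\, v_k(x), \tag{9.131}$$

where

$$r_k = \sqrt{c_k^2 + d_k^2}, \qquad \delta_k = \tan^{-1} \frac{d_k}{c_k}. \tag{9.132}$$

The initial conditions

$$g(x) = u(0, x) = \sum_{k=1}^{\infty} c_k\, v_k(x), \qquad h(x) = u_t(0, x) = \sum_{k=1}^{\infty} d_k\, \omega_k\, v_k(x), \tag{9.133}$$

are used to specify the coefficients:

$$c_k = \frac{\langle g, v_k \rangle}{\|v_k\|^2}, \qquad d_k = \frac{\langle h, v_k \rangle}{\omega_k\, \|v_k\|^2}. \tag{9.134}$$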


In the unstable, positive semi-definite cases, any null eigenfunction $v_0 \in \ker S = \ker L$ contributes two aperiodic eigensolutions:

$$u_0(t, x) = v_0(x), \qquad \tilde u_0(t, x) = t\, v_0(x),$$

as can be readily checked. The first is constant in time, while the second is an unstable, linearly growing mode, which is excited if and only if the initial velocity is not orthogonal to the null eigenfunction: $\langle h, v_0 \rangle \ne 0$.

If, as occurred in the one-dimensional wave equation, the natural frequencies happen to be integer multiples of a common frequency, $\omega_k = n_k\, \omega_\star$ for $n_k \in \mathbb{N}$, then the solution (9.131) is a periodic function of $t$ with period $p_\star = 2\pi/\omega_\star$. On the other hand, in most cases the frequencies are not rationally related, and the solution is only quasiperiodic.
Although  it  is  the  sum  of  individually  periodic  modes,  it  is  not  periodic,  and  never  exactly 
reproduces  its  initial  behavior;  see  the  illustrative  Example  2.20  for  additional  details. 
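Two of the claims above are easy to test numerically at a fixed point $x$, where only the time factors matter. The sketch below is illustrative only; the function names, coefficients, and frequencies are assumptions. It first confirms the amplitude-phase identity $c \cos\theta + d \sin\theta = r \cos(\theta - \delta)$, computing $\delta$ with `math.atan2` so that the correct branch of $\tan^{-1}(d/c)$ is selected even when $c \le 0$, and then compares a sum of modes with commensurate frequencies against one with frequencies $1$ and $\sqrt 2$.

```python
import math


def amplitude_phase(c, d):
    """Return (r, delta) such that c cos(theta) + d sin(theta) = r cos(theta - delta)."""
    return math.hypot(c, d), math.atan2(d, c)


def time_factor(t, freqs, cs, ds):
    """Sum of c_k cos(omega_k t) + d_k sin(omega_k t): the series (9.131) at a fixed x."""
    return sum(c * math.cos(w * t) + d * math.sin(w * t) for w, c, d in zip(freqs, cs, ds))


# Amplitude-phase form, with c < 0 to exercise the branch choice.
c, d, w, t = -1.2, 0.7, 3.0, 0.9
r, delta = amplitude_phase(c, d)
print(c * math.cos(w * t) + d * math.sin(w * t), r * math.cos(w * t - delta))

# Frequencies that are integer multiples of omega_* = pi: periodic with period 2 pi / omega_* = 2.
commensurate = [math.pi, 2 * math.pi, 3 * math.pi]
print(time_factor(0.3, commensurate, [1, 1, 1], [0, 1, 0]),
      time_factor(2.3, commensurate, [1, 1, 1], [0, 1, 0]))

# Frequencies 1 and sqrt(2) are not rationally related: the first mode's period 2 pi
# is not a period of the sum.
incommensurate = [1.0, math.sqrt(2.0)]
print(time_factor(0.3, incommensurate, [1, 1], [0, 0]),
      time_factor(0.3 + 2 * math.pi, incommensurate, [1, 1], [0, 0]))
```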


Forcing  and  Resonance 

Periodically  forcing  an  undamped  mechanical  structure,  modeled  by  a  vibrational  system 
of  ordinary  differential  equations,  at  a  frequency  that  is  distinct  from  its  natural  vibrational 
frequencies, leads, in general, to a quasiperiodic response. The solution is a sum of the
unforced vibrational modes superimposed with an additional component that vibrates at
the  forcing  frequency.  However,  if  forced  at  one  of  its  natural  frequencies,  the  system  may 
experience  a  catastrophic  resonance.  See  [89;  §9.6]  for  details. 

The same type of quasiperiodic/resonant response is also observed in the partial differential equations governing the vibrations of continuous media. Consider the forced vibrational equation

$$\frac{\partial^2 u}{\partial t^2} = -S[u] + F(t, x) \tag{9.135}$$


subject  to  specified  homogeneous  boundary  conditions.  The  external  forcing  function 
F(t,x)  may  depend  on  both  time  t  and  position  x.  We  will  be  particularly  interested 
in  a  periodically  varying  external  force  of  the  form 


$$F(t, x) = \cos(\omega t)\, h(x), \tag{9.136}$$

where $\omega$ is the forcing frequency, while the forcing profile $h(x)$ is unvarying.

As  always,  the  solution  to  an  inhomogeneous  linear  equation  can  be  written  as  a 
combination, 

$$u(t, x) = u_\star(t, x) + z(t, x), \tag{9.137}$$

of a particular solution $u_\star(t, x)$ to the inhomogeneous forced equation combined with the general solution $z(t, x)$ to the homogeneous equation, namely

$$\frac{\partial^2 z}{\partial t^2} = -S[z]. \tag{9.138}$$


The boundary and initial conditions will serve to uniquely prescribe the solution $u(t, x)$, but there is some flexibility in its two constituents (9.137). For instance, we may ask that the particular solution $u_\star$ satisfy the homogeneous boundary conditions along with
zero  (homogeneous)  initial  conditions  and  thus  represent  the  pure  response  of  the  system 
to  the  forcing.  The  homogeneous  solution  z(t,x)  will  then  reflect  the  effect  of  the  initial 
and  boundary  conditions  unadulterated  by  the  external  forcing.  The  final  solution  is  the 
combined  sum  of  the  two  individual  responses. 

In  the  case  of  periodic  forcing  (9.136),  we  look  for  a  particular  solution 


$$u_\star(t, x) = \cos(\omega t)\, v_\star(x) \tag{9.139}$$

that vibrates at the forcing frequency. Substituting the ansatz (9.139) into the equation (9.135) and canceling the common cosine factors, we discover that $v_\star(x)$ must satisfy the boundary value problem prescribed by a forced differential equation

$$S[v_\star] - \omega^2\, v_\star = h(x), \tag{9.140}$$

supplemented  by  the  relevant  homogeneous  boundary  conditions:  Dirichlet,  Neumann, 
mixed,  or  periodic. 

At this juncture, there are two possibilities. If the unforced homogeneous boundary value problem

$$S[v] - \omega^2\, v = 0 \tag{9.141}$$

has only the trivial solution $v \equiv 0$, then, according to the Fredholm Alternative Theorem 9.10, a solution to the forced boundary value problem will exist† for any form of the forcing function $h(x)$. In other words, if $\omega^2$ is not an eigenvalue, then the particular solution (9.139) will vibrate with the forcing frequency, and the general solution will be a periodic or quasiperiodic combination (9.137) of the natural vibrational modes along with the vibrational response to the periodic forcing.

† Existence is immediate in finite-dimensional systems. For boundary value problems, this relies on an analytic existence theorem, e.g., Theorem 9.28.

On the other hand, if $\omega^2 = \lambda_k$ is an eigenvalue, and so $\omega = \omega_k$ coincides with one of the natural vibrational frequencies of the homogeneous problem, then (9.141) admits nontrivial solutions, namely the eigenfunction‡ $v_k(x)$. In such cases, the Fredholm Alternative tells us that the boundary value problem (9.140) admits a solution if and only if the forcing function is orthogonal to the eigenfunction:

$$\langle h, v_k \rangle = 0. \tag{9.142}$$

If  this  holds,  then  the  resulting  particular  solution  (9.139)  still  vibrates  with  the  forcing 
frequency,  and  resonance  doesn’t  occur. 
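In finite dimensions, the dichotomy just described reduces to linear algebra, and a small example makes it tangible. The sketch below is an illustration under stated assumptions: the matrix `K`, its hand-computed eigenpairs, and the function `forced_response` are all hypothetical. It solves $(K - \omega^2 I)\, v = h$ by eigenvector expansion and reports failure exactly when $\omega^2$ is an eigenvalue and $h$ is not orthogonal to the corresponding eigenvector.

```python
# Hypothetical symmetric positive definite matrix standing in for the operator S.
K = [[2.0, -1.0], [-1.0, 2.0]]
# Eigenpairs (lambda_k, v_k) found by hand; natural frequencies are 1 and sqrt(3).
EIGENPAIRS = [(1.0, (1.0, 1.0)), (3.0, (1.0, -1.0))]


def dot(a, b):
    """Euclidean dot product."""
    return sum(x * y for x, y in zip(a, b))


def forced_response(h, omega, tol=1e-12):
    """Solve (K - omega^2 I) v = h by eigenvector expansion.

    Returns one solution v, or None when omega^2 is an eigenvalue and the
    orthogonality condition <h, v_k> = 0 fails (the resonant case).
    """
    v = [0.0, 0.0]
    for lam, vk in EIGENPAIRS:
        hk = dot(h, vk) / dot(vk, vk)   # component of the forcing along v_k
        gap = lam - omega ** 2
        if abs(gap) < tol:
            if abs(hk) > tol:
                return None             # forcing excites the resonant mode
            continue                    # forcing has no component along v_k
        v = [vi + hk / gap * vki for vi, vki in zip(v, vk)]
    return v


print(forced_response([1.0, 0.0], 0.5))    # omega^2 is not an eigenvalue: always solvable
print(forced_response([1.0, -1.0], 1.0))   # omega^2 = lambda_1, but h is orthogonal to v_1
print(forced_response([1.0, 0.0], 1.0))    # omega^2 = lambda_1 and <h, v_1> != 0: resonance
```

In the second call the solution is not unique, since any multiple of $v_1$ may be added; the sketch simply returns the solution with no $v_1$ component.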

If we force in a resonant manner, meaning that the Fredholm condition (9.142) does not hold, then the solution will be a resonantly growing vibration of the form

$$u_\star(t, x) = a\, t \sin(\omega_k t)\, v_k(x) + \cos(\omega_k t)\, v_\star(x), \tag{9.143}$$

in which $a$ is a constant to be specified as follows. By direct calculation,

$$\frac{\partial^2 u_\star}{\partial t^2} + S[u_\star] = a\, t \sin(\omega_k t)\, \bigl( S[v_k] - \omega_k^2\, v_k(x) \bigr) + \cos(\omega_k t)\, \bigl( S[v_\star] - \omega_k^2\, v_\star(x) + 2\, a\, \omega_k\, v_k(x) \bigr).$$

The first term vanishes, since $v_k(x)$ is an eigenfunction with eigenvalue $\lambda_k = \omega_k^2$. Therefore, (9.143) satisfies the forced vibrational equation (9.135) if and only if $v_\star(x)$ satisfies the forced boundary value problem

$$S[v_\star] - \omega_k^2\, v_\star(x) = h(x) - 2\, a\, \omega_k\, v_k(x). \tag{9.144}$$

Again, the Fredholm Alternative implies that (9.144) admits a solution $v_\star(x)$ if and only if

$$0 = \langle h - 2\, a\, \omega_k\, v_k, v_k \rangle = \langle h, v_k \rangle - 2\, a\, \omega_k\, \|v_k\|^2, \qquad \text{and hence} \qquad a = \frac{\langle h, v_k \rangle}{2\, \omega_k\, \|v_k\|^2}, \tag{9.145}$$

which serves to fix the value of the constant in the resonant solution ansatz (9.143). In a real-world situation, such large resonant (or even near resonant) vibrations will, if unchecked, eventually lead either to a catastrophic breakdown of the system or to a transition into the nonlinear regime.


‡ For simplicity, we assume that the eigenvalue $\lambda_k$ is simple, and so there is a unique, up to constant multiple, eigenfunction $v_k$. Modifications for multiple eigenvalues proceed analogously.

Example 9.51. As a specific example, consider the initial-boundary value problem modeling the forced vibrations of a uniform string of unit length that is fixed at both ends:

$$u_{tt} = c^2\, u_{xx} + \cos(\omega t)\, h(x), \qquad u(t, 0) = 0 = u(t, 1), \qquad u(0, x) = f(x), \qquad u_t(0, x) = g(x).$$

The particular solution $u_\star(t, x)$ will have the nonresonant form (9.139), provided there exists a solution $v_\star(x)$ to the boundary value problem

$$S[v_\star] - \omega^2\, v_\star = -c^2\, v_\star'' - \omega^2\, v_\star = h(x), \qquad v_\star(0) = 0 = v_\star(1). \tag{9.147}$$


The  natural  frequencies  and  associated  eigenfunctions  of  the  unforced  Dirichlet  boundary 
value  problem  are 


$$\omega_k = k\, c\, \pi, \qquad v_k(x) = \sin k \pi x, \qquad k = 1, 2, 3, \ldots.$$

Thus, the boundary value problem (9.147) will admit a solution, and hence the forcing is not resonant, if either $\omega \ne \omega_k$ is not a natural frequency or $\omega = \omega_k$ for some $k$ but the forcing profile is orthogonal to the associated eigenfunction:

$$0 = \langle h, v_k \rangle = \int_0^1 h(x)\, \sin k \pi x\, dx. \tag{9.148}$$

Otherwise,  the  system  will  undergo  a  resonant  response. 

For example, under periodic forcing of frequency $\omega$ with trigonometric sine profile $h(x) = \sin k \pi x$, for $k$ a positive integer, the particular solution to (9.147) is

$$v_\star(x) = \frac{\sin k \pi x}{k^2 \pi^2 c^2 - \omega^2}, \qquad \text{so that} \qquad u_\star(t, x) = \frac{\cos \omega t\; \sin k \pi x}{k^2 \pi^2 c^2 - \omega^2}, \tag{9.149}$$

which is valid provided $\omega \ne \omega_k = k\, \pi\, c$. Observe that we may allow the forcing frequency to coincide with any of the other natural frequencies, $\omega = \omega_n$ for $n \ne k$, because the sine profiles are mutually orthogonal, and so the nonresonance condition (9.148) holds. On the other hand, if $\omega = \omega_k = k\, \pi\, c$, then the particular solution

$$u_\star(t, x) = \frac{t\, \sin(k \pi c\, t)\, \sin k \pi x}{2\, k\, \pi\, c} \tag{9.150}$$

is resonant and grows linearly in time.

To obtain the full solution to the initial-boundary value problem, we write $u = u_\star + z$, where $z(t, x)$ must satisfy

$$z_{tt} - c^2\, z_{xx} = 0, \qquad z(t, 0) = 0 = z(t, 1),$$

along with the modified initial conditions

$$z(0, x) = f(x) - \frac{\sin k \pi x}{k^2 \pi^2 c^2 - \omega^2}, \qquad \frac{\partial z}{\partial t}(0, x) = g(x),$$

stemming from the fact that the particular solution (9.149) has a nontrivial initial displacement. (In the resonant case (9.150), there is no extra term in the initial data.) Note that the closer $\omega$ is to the resonant frequency, the larger the modification of the initial data, and hence the larger the response of the system to the periodic forcing. As before, the solution $z(t, x)$ to the homogeneous equation can be written as a Fourier sine series (4.68).
The  final  formulas  are  left  for  the  reader  to  write  out  in  detail;  see  Exercise  9.5.14. 
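Before writing out the general series, it can be reassuring to test the two particular solutions of this example numerically. The sketch below is a sanity check rather than a derivation; the function names, the sample values $k = 2$, $c = 1.5$, and the finite-difference step are assumptions. It evaluates the residual of $u_{tt} = c^2 u_{xx} + \cos(\omega t) \sin k \pi x$ for the nonresonant and resonant formulas, and then looks at the special case of zero initial data $f = g = 0$, where the full solution is the particular solution minus the natural mode $\cos(\omega_k t) \sin k \pi x / (\omega_k^2 - \omega^2)$. As $\omega \to \omega_k$, this should approach the linearly growing resonant solution.

```python
import math


def u_nonresonant(t, x, k, c, omega):
    """cos(omega t) sin(k pi x) / (k^2 pi^2 c^2 - omega^2), valid for omega != k pi c."""
    return math.cos(omega * t) * math.sin(k * math.pi * x) / ((k * math.pi * c) ** 2 - omega ** 2)


def u_resonant(t, x, k, c):
    """t sin(k pi c t) sin(k pi x) / (2 k pi c), for forcing frequency omega = k pi c."""
    wk = k * math.pi * c
    return t * math.sin(wk * t) * math.sin(k * math.pi * x) / (2 * wk)


def pde_residual(u, t, x, k, c, omega, h=1e-3):
    """u_tt - c^2 u_xx - cos(omega t) sin(k pi x), using central second differences."""
    utt = (u(t + h, x) - 2 * u(t, x) + u(t - h, x)) / h ** 2
    uxx = (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / h ** 2
    return utt - c ** 2 * uxx - math.cos(omega * t) * math.sin(k * math.pi * x)


def u_from_rest(t, x, k, c, omega):
    """Solution with u(0, x) = 0 = u_t(0, x): particular solution plus one natural mode."""
    wk = k * math.pi * c
    return (math.cos(omega * t) - math.cos(wk * t)) * math.sin(k * math.pi * x) / (wk ** 2 - omega ** 2)


k, c = 2, 1.5
wk = k * math.pi * c

# Both residuals should be small (limited by the finite-difference error).
print(pde_residual(lambda t, x: u_nonresonant(t, x, k, c, 4.0), 1.3, 0.37, k, c, 4.0))
print(pde_residual(lambda t, x: u_resonant(t, x, k, c), 1.3, 0.37, k, c, wk))

# Near resonance, the solution starting from rest is close to the resonant one.
print(u_from_rest(2.0, 0.3, k, c, wk + 1e-5), u_resonant(2.0, 0.3, k, c))
```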
Exercises


9.5.9. Which of the following forcing functions $F(t, x)$ excites resonance in the wave equation $u_{tt} = u_{xx} + F(t, x)$ when subject to homogeneous Dirichlet boundary conditions on the interval $0 < x < 1$? (a) $\sin 3t$, (b) $\sin 3\pi t$, (c) $\sin \tfrac32 \pi t$, (d) $\sin \pi t\, \sin \pi x$, (e) $\sin \pi t\, \sin 2\pi x$, (f) $\sin 2\pi t\, \cos \pi x$, (g) $x\,(1 - x) \sin 2\pi t$.

9.5.10. Answer Exercise 9.5.9 when the solution is subject to the mixed boundary conditions $u(t, 0) = u_x(t, 1) = 0$.

T 9.5.11. Let $\omega > 0$. Find the solution to the initial-boundary value problem

$$u_{tt} = u_{xx} + \cos \omega t, \qquad u(t, 0) = 0 = u(t, 1), \qquad u(0, x) = 0 = u_t(0, x).$$

9.5.12.  Answer  Exercise  9.5.11  for  homogeneous  Neumann  boundary  conditions. 

9.5.13. A piano wire of length 1 m and wave speed $c = 2$ m/sec can support a maximal
deflection  of  5  cm  before  breaking.  Suppose  the  wire  starts  at  rest,  with  both  ends  fixed, 
and  then  is  subject  to  a  uniform  periodic  force  F(t,  x)  =  ^  cos  cat  sin7rx.  What  range  of 
frequencies  will  cause  the  wire  to  break? 


0  9.5.14.  Write  out  the  eigenfunction  series  solution  to  the  initial-boundary  value  problem  in 
Example 9.51 with $h(x) = \sin k \pi x$.

9.5.15.  How  should  the  solution  formulas  (9.131,  134)  be  modified  when  there  are  unstable 
modes?  Write  down  explicit  conditions  on  the  initial  data  that  prevent  an  instability  from 
being  excited. 


^  9.5.16.  Explain  how  to  convert  the  homogeneous  wave  equation  with  inhomogeneous  Dirichlet 
boundary conditions $u(t, 0) = \alpha(t)$, $u(t, \ell) = \beta(t)$, into a homogeneous boundary value
problem  for  the  forced  wave  equation.  Hint :  Mimic  (4.46). 


T  9.5.17.  Two  children  hold  a  jump  rope  taut,  while  one  of  them  periodically  shakes  their  end  of 
the  rope.  Use  the  inhomogeneous  boundary  value  problem 

$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2}, \qquad u(t, 0) = 0, \quad u(t, 1) = \sin \omega t,$$

to model the motion of the rope, adopting units in which the wave speed $c = 1$.

(a) What are the resonant frequencies of this system?

(b) Apply the method of Exercise 9.5.16 to find a particular solution to the boundary value problem when $\omega$ is a nonresonant frequency.

(c) Suppose the rope starts at rest. Find a series solution to the corresponding initial-boundary value problem when $\omega$ is a nonresonant frequency.

(d) Answer parts (b,c) when $\omega$ is a resonant frequency. Hint: Use the ansatz (9.143).


9.5.18.  Explain  how  to  solve  the  periodically  forced  telegrapher’s  equation 

$$u_{tt} + a\, u_t = c^2\, u_{xx} + h(x) \cos \omega t$$

on  the  interval  0  <  x  <  1  when  subject  to  homogeneous  Dirichlet  boundary  conditions. 

At  which  frequencies  does  the  forcing  function  excite  a  resonant  response? 

Hint :  First  solve  Exercise  4.2.9. 

9.5.19. The fourth-order evolution equation $u_{tt} = -c^2\, u_{xxxx}$, subject to the boundary conditions $u(t, 0) = u_{xx}(t, 0) = u(t, 1) = u_{xx}(t, 1) = 0$, models the transverse vibrations of a simply supported uniform thin elastic beam, in which $c > 0$ represents the wave speed. Write down an eigenfunction series for the solution to the initial value problem $u(0, x) = f(x)$, $u_t(0, x) = 0$. Is the solution (i) periodic, (ii) quasiperiodic, (iii) chaotic, (iv) none of the above?
The Schrödinger Equation


The fundamental dynamical system that governs all quantum-mechanical systems is known as the Schrödinger equation, first written down by the great twentieth-century Austrian physicist Erwin Schrödinger, one of the preeminent founders of modern quantum physics. His original series of papers, in which, by fits and starts, he arrives at his fundamental equation, makes for fascinating reading, [101].

Unlike classical mechanics, quantum mechanics is a completely linear theory, governed by linear systems of partial differential equations. The abstract form of the linear Schrödinger equation is

$$i\, \hbar\, \frac{\partial \psi}{\partial t} = S[\psi], \tag{9.151}$$

where $S$ is a linear operator of the usual self-adjoint form (9.121). In this equation, $i = \sqrt{-1}$, while $\hbar$ is Planck's constant (7.69). The operator $S$ is known as the Hamiltonian for the quantum-mechanical system, and, typically, represents the quantum energy operator. For physical systems such as atoms and nuclei, the relevant Hamiltonian operator is constructed from the classical energy through the rather mysterious process of "quantization".


At each time $t$, the solution $\psi(t, x)$ to the Schrödinger equation represents the wave function of the quantum system, and so should be a complex-valued square-integrable function having unit $L^2$ norm: $\|\psi\| = 1$. (The reader may wish to revisit Sections 3.5 and 7.1 for a discussion of the basics of quantum mechanics and Hilbert space.) We interpret the wave function as a probability density on the possible quantum states, and so the Schrödinger equation governs the dynamical evolution of quantum probabilities. The interested reader should consult a basic text on quantum mechanics, e.g., [66, 72, 115], for full details on both the physics and underlying mathematics.


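Proposition 9.52. If $\psi(t, x)$ is a solution to the Schrödinger equation, its Hermitian $L^2$ norm $\|\psi(t, \cdot)\|$ is fixed for all time.

Proof: Since the solution is complex-valued, we use the sesquilinearity of the underlying Hermitian inner product, as in (B.19), to compute

$$\frac{d}{dt}\, \|\psi\|^2 = \Bigl\langle \frac{\partial \psi}{\partial t}, \psi \Bigr\rangle + \Bigl\langle \psi, \frac{\partial \psi}{\partial t} \Bigr\rangle = \Bigl\langle -\frac{i}{\hbar}\, S[\psi], \psi \Bigr\rangle + \Bigl\langle \psi, -\frac{i}{\hbar}\, S[\psi] \Bigr\rangle = -\frac{i}{\hbar}\, \langle S[\psi], \psi \rangle + \frac{i}{\hbar}\, \langle \psi, S[\psi] \rangle = 0,$$

which vanishes because $S$ is self-adjoint. This implies that $\|\psi(t, \cdot)\|^2$ is constant. Q.E.D.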


As a result, if the initial data $\psi(t_0, x) = \psi_0(x)$ is a quantum-mechanical wave function, meaning that $\|\psi_0\| = 1$, then, at each time $t$, the solution $\psi(t, x)$ to the Schrödinger equation also has norm 1, and hence remains a wave function throughout the evolutionary process.

Apart from the extra factor of $i\, \hbar$, the Schrödinger equation looks like a diffusion equation (9.122). This inspires us to seek separable solutions with an exponential ansatz:

$$\psi(t, x) = e^{\alpha t}\, v(x).$$
Substituting this expression into the Schrödinger equation (9.151) and canceling the common exponential factors reduces us to the usual eigenvalue problem

$$S[v] = \lambda\, v, \qquad \text{with eigenvalue} \qquad \lambda = i\, \hbar\, \alpha.$$

By self-adjointness, the eigenvalues are necessarily real. Let $v_k$ denote the normalized eigenfunction, so $\|v_k\| = 1$, associated with the $k$th eigenvalue $\lambda_k$. The corresponding eigensolution of the Schrödinger equation is the complex-valued function

$$\psi_k(t, x) = e^{-i \lambda_k t/\hbar}\, v_k(x).$$

Observe that, in contrast to the exponentially decaying solutions to the diffusion equation, the eigensolutions to the Schrödinger equation are periodic, with vibrational frequencies $\omega_k = -\lambda_k/\hbar$ proportional to the eigenvalues (along with constant solutions corresponding to the null eigenmodes, if any). The general solution is a (quasi)periodic series in the fundamental eigensolutions,

$$\psi(t, x) = \sum_k c_k\, \psi_k(t, x) = \sum_k c_k\, e^{-i \lambda_k t/\hbar}\, v_k(x), \tag{9.152}$$

whose  coefficients  are  prescribed  by  the  initial  conditions.  The  periodicity  of  the  summands 
has the additional implication that, again unlike the diffusion equation, the Schrödinger
equation  can  be  run  backwards  in  time,  i.e.,  it  remains  well-posed  in  the  past.  Consequently, 
we  can  determine  both  the  past  and  future  behavior  of  a  quantum  system  from  its  present 
configuration. 
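Both the conservation of the norm and the reversibility in time can be observed directly on a truncated eigenfunction series. The sketch below makes several assumptions for illustration: it uses the Dirichlet eigenfunctions $\sqrt{2/\ell}\, \sin(k \pi x/\ell)$ of $S = -D^2$ on an interval of length $\ell = 1$, units in which $\hbar = 1$, a two-term initial state, and a midpoint-rule quadrature; none of the function names come from the text.

```python
import cmath
import math

HBAR = 1.0   # assumed units in which hbar = 1
ELL = 1.0    # assumed interval length


def eigenvalue(k):
    """lambda_k = (k pi / ell)^2 for S = -D^2 with Dirichlet boundary conditions."""
    return (k * math.pi / ELL) ** 2


def eigenfunction(k, x):
    """Normalized eigenfunction sqrt(2 / ell) sin(k pi x / ell)."""
    return math.sqrt(2 / ELL) * math.sin(k * math.pi * x / ELL)


def evolve(coeffs, t):
    """Advance the coefficients by time t: c_k -> c_k exp(-i lambda_k t / hbar), as in (9.152)."""
    return [c * cmath.exp(-1j * eigenvalue(k) * t / HBAR)
            for k, c in enumerate(coeffs, start=1)]


def psi(x, coeffs):
    """Value of the truncated series sum_k c_k v_k(x)."""
    return sum(c * eigenfunction(k, x) for k, c in enumerate(coeffs, start=1))


def norm_squared(coeffs, n=2000):
    """Midpoint-rule approximation of the integral of |psi|^2 over [0, ell]."""
    dx = ELL / n
    return sum(abs(psi((j + 0.5) * dx, coeffs)) ** 2 for j in range(n)) * dx


c0 = [0.6, 0.8j]                  # |c_1|^2 + |c_2|^2 = 1, so the initial state has unit norm
later = evolve(c0, 0.37)
back = evolve(later, -0.37)       # running the evolution backwards in time

print(norm_squared(c0), norm_squared(later))   # both should be very close to 1
print(back)                                    # should recover c0 up to rounding
```

Because each coefficient is merely multiplied by a complex number of modulus one, the moduli $|c_k|$ never change, which is the series counterpart of Proposition 9.52.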

The eigenvalues represent the energy levels of the system described by the Schrödinger
equation  and  can  be  experimentally  detected  by  exciting  the  system.  For  instance,  when 
an  excited  electron  orbiting  a  nucleus  jumps  back  to  a  lower  energy  level,  it  emits  a  photon 
whose  observed  electromagnetic  spectral  line  corresponds  to  the  difference  between  the 
energies  of  the  two  quantum  levels.  This  motivates  the  use  of  the  term  spectrum  to  describe 
the  eigenvalues  of  a  linear  Hamiltonian  operator. 


Example 9.53. The simplest version of the Schrödinger equation is based on the derivative operator $L = D_x$, leading to the self-adjoint combination $S = L^*\circ L = -D_x^2$ when subject to appropriate boundary conditions. In this case, the Schrödinger equation (9.151) reduces to the second-order partial differential equation

$$i\hbar\,\frac{\partial\psi}{\partial t} = -\,\frac{\partial^2\psi}{\partial x^2}. \tag{9.153}$$

If we impose the Dirichlet boundary conditions $\psi(t,0) = \psi(t,\ell) = 0$, then the Schrödinger equation (9.153) governs the dynamics of a quantum particle that is confined to the interval $0 < x < \ell$; the boundary conditions imply that there is zero probability of the particle escaping from the interval.

According to Section 4.1, the eigenfunctions of the Dirichlet eigenvalue problem

$$v'' + \lambda v = 0, \qquad v(0) = v(\ell) = 0,$$

are

$$v_k(x) = \sqrt{\frac{2}{\ell}}\,\sin\frac{k\pi x}{\ell}, \qquad \text{with eigenvalue} \quad \lambda_k = \frac{k^2\pi^2}{\ell^2}, \qquad \text{for } k = 1, 2, \ldots,$$

where the initial factor ensures that $v_k$ has unit $L^2$ norm, and hence is a bona fide wave function. The corresponding oscillatory eigenmodes are

$$\psi_k(t,x) = \sqrt{\frac{2}{\ell}}\,\exp\!\left(-\,\frac{i\,k^2\pi^2\,t}{\hbar\,\ell^2}\right)\sin\frac{k\pi x}{\ell}.$$

Since the temporal frequencies $\omega_k = -k^2\pi^2/(\hbar\ell^2)$ depend nonlinearly on the wave number $k\pi/\ell$, the Schrödinger equation is, in fact, dispersive, sharing many similarities with the third-order linear equation (8.90); see, for instance, Exercises 9.5.25 and 9.5.27.
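As a numerical illustration of the eigenseries (9.152) for the particle in a box, the following Python sketch sums finitely many eigenmodes and checks that the $L^2$ norm of the partial sum stays (very nearly) equal to 1 as time evolves. The values $\hbar = 1$ and $\ell = 1$, the initial wave function $\psi(0,x) = \sqrt{30}\,x(1-x)$, and all function names are assumptions chosen purely for illustration.

```python
import cmath
import math

HBAR = 1.0   # assumed value of hbar (illustration only)
ELL = 1.0    # assumed interval length; coefficient() below relies on ELL = 1

def eigenvalue(k):
    """Dirichlet eigenvalue lambda_k = k^2 pi^2 / ell^2."""
    return (k * math.pi / ELL) ** 2

def eigenfunction(k, x):
    """Normalized eigenfunction v_k(x) = sqrt(2/ell) sin(k pi x / ell)."""
    return math.sqrt(2.0 / ELL) * math.sin(k * math.pi * x / ELL)

def coefficient(k):
    """c_k = <psi_0, v_k> for the assumed data psi_0(x) = sqrt(30) x (1 - x)."""
    if k % 2 == 0:
        return 0.0          # even modes are orthogonal to this psi_0
    return 4.0 * math.sqrt(60.0) / (k * math.pi) ** 3

def psi(t, x, n_terms=50):
    """Partial sum of the eigenseries (9.152)."""
    return sum(
        coefficient(k) * cmath.exp(-1j * eigenvalue(k) * t / HBAR) * eigenfunction(k, x)
        for k in range(1, n_terms + 1)
    )

def norm_squared(t, n_terms=50, n_grid=400):
    """Midpoint-rule approximation of the squared L^2 norm of psi(t, .)."""
    h = ELL / n_grid
    return h * sum(abs(psi(t, (m + 0.5) * h, n_terms)) ** 2 for m in range(n_grid))

if __name__ == "__main__":
    for t in (0.0, 0.1, 1.0):
        print(t, norm_squared(t))
```

Each printed norm should be close to 1, since the time dependence enters only through unimodular factors $e^{-i\lambda_k t/\hbar}$; the small deficit comes from truncating the series.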


Exercises 


9.5.20.  (a)  Solve  the  following  initial  boundary  value  problem: 

i  fr'iPt  =  V’(bO)  =  0(£,  1)  =  0,  0(  0,x)  =  1. 

(b) Using your solution formula, verify that $\|\psi(t,\cdot)\| = 1$ for all $t$.

9.5.21. Answer Exercise 9.5.20 for the initial condition $\psi(0,x) = \sqrt{30}\,x(1-x)$.

9.5.22. Answer Exercise 9.5.20 when the solution is subject to Neumann boundary conditions

$$\psi_x(t,0) = \psi_x(t,1) = 0.$$

9.5.23.  Write  down  the  eigenseries  solution  for  the  Schrodinger  equation  on  a  bounded  interval 
[0,£]  when  subject  to  homogeneous  Neumann  boundary  conditions. 

9.5.24. Given the solution formula (9.152), and assuming completeness of the eigenfunctions, prove that $\|\psi(t,\cdot)\|^2 = \sum_k |c_k|^2$ for all $t$.

9.5.25. Write down the dispersion relation, phase velocity, and group velocity for the one-dimensional Schrödinger equation (9.153).

9.5.26. Show that the real and imaginary parts of the solution $\psi(t,x) = u(t,x) + i\,v(t,x)$ to the one-dimensional Schrödinger equation (9.153) are solutions to the beam equation of Exercise 9.5.19. What is the wave speed?

0 9.5.27. The Talbot effect for the linear Schrödinger equation: Let $u(t,x)$ solve the periodic initial-boundary value problem

$$i\,u_t = u_{xx}, \qquad u(t,-\pi) = u(t,\pi), \qquad u_x(t,-\pi) = u_x(t,\pi),$$

with initial data $u(0,x) = \sigma(x)$ given by the unit step function. Prove that when $t = \pi p/q$, where $p, q$ are integers, the solution $u(t,x)$ is constant on each interval $\pi j/q < x < \pi(j+1)/q$ for integers $j \in \mathbb{Z}$. Hint: Use Exercise 6.1.29(d).


9.5.28. The wave function $\psi(t,x)$ of a one-dimensional free quantum particle of mass $m$ satisfies the Schrödinger equation $i\,\psi_t = -\hbar\,\psi_{xx}/(2m)$ on the real line $-\infty < x < \infty$. Assuming that $\psi$ and its $x$ derivatives decay reasonably rapidly to zero as $|x| \to \infty$, prove that the particle's expected position

$$\langle x\rangle = \int_{-\infty}^{\infty} x\,|\psi(t,x)|^2\,dx$$

moves on a straight line. Hint: Prove that $\dfrac{d^2\langle x\rangle}{dt^2} = 0$.
C 9.5.29. Consider the periodically forced Schrödinger equation $i\hbar\,\psi_t = -\psi_{xx} + e^{i\omega t}$ on the interval $0 < x < 1$, subject to homogeneous Dirichlet boundary conditions. (a) At which frequencies $\omega$ does the forcing function excite a resonant response? (b) Find the solution to the general initial value problem for a nonresonant forcing frequency. (c) Find the solution to the general initial value problem for a resonant forcing frequency. What are the conditions on $\hbar$ that ensure that the resulting solution remains a wave function?

C 9.5.30. The Schrödinger equation for the harmonic oscillator is $i\hbar\,\psi_t = \psi_{xx} - x^2\psi$. Write this equation in the self-adjoint form (9.151) under a suitable choice of boundary conditions. Write down the self-adjoint boundary value problem for the eigenfunctions.

Remark :  The  eigenfunctions  are  not  elementary  functions.  After  studying  Section  11.3,  you 
may  wish  to  return  here  to  investigate  its  solutions. 


Chapter  10 

Finite  Elements  and  Weak  Solutions 


In  Chapter  5,  we  studied  the  oldest,  and  in  many  ways  the  simplest,  class  of  numerical 
algorithms  for  approximating  the  solutions  to  partial  differential  equations:  those  based  on 
finite  difference  approximations.  In  the  present  chapter,  we  introduce  the  second  of  the  two 
major  numerical  paradigms:  the  finite  element  method.  Finite  elements  are  of  more  recent 
vintage,  having  first  appeared  soon  after  the  Second  World  War;  historical  details  can  be 
found  in  [113].  As  a  consequence  of  their  ability  to  adapt  to  complicated  geometries,  finite 
elements  have,  in  many  situations,  become  the  method  of  choice  for  solving  equilibrium 
boundary  value  problems  governed  by  elliptic  partial  differential  equations.  Finite  elements 
can  also  be  adapted  to  dynamical  problems,  but  lack  of  space  prevents  us  from  pursuing 
such  extensions  in  this  text. 

Finite elements rely on a more sophisticated understanding of the partial differential equation, in that, unlike finite differences, they are not obtained by simply replacing derivatives by their numerical approximations. Rather, they are initially founded on an associated minimization principle that, as we learned in Chapter 9, characterizes the unique solution to a positive definite boundary value problem. The basic idea is to restrict the minimizing functional to an appropriately chosen finite-dimensional subspace of functions. Such a restriction produces a finite-dimensional minimization problem, which can then be solved by numerical linear algebra. When properly formulated, the restricted finite-dimensional minimization problem will have a solution that well approximates the true minimizer, and hence the solution to the original boundary value problem. To gain familiarity with the underlying principles, we will first illustrate the basic constructions in
the  context  of  boundary  value  problems  for  ordinary  differential  equations.  The  following 
section  extends  finite  element  analysis  to  boundary  value  problems  associated  with  the 
two-dimensional  Laplace  and  Poisson  equations,  thereby  revealing  the  key  features  used 
in  applications  to  the  numerical  solution  of  multidimensional  equilibrium  boundary  value 
problems. 

An  alternative  approach  to  the  finite  element  method,  one  that  can  be  applied  even  in 
situations  in  which  no  minimum  principle  is  available,  is  founded  on  the  concept  of  a  weak 
solution  to  the  differential  equation,  a  construction  of  independent  analytical  importance. 
The  term  “weak”  refers  to  the  fact  that  one  is  able  to  relax  the  differentiability  requirements 
imposed  on  classical  solutions.  Indeed,  as  we  will  show,  discontinuous  shock  wave  solutions 
as  well  as  the  nonsmooth,  and  hence  nonclassical,  solutions  to  the  wave  equation  that  we 
encountered  in  Chapters  2  and  4  can  all  be  rigorously  characterized  through  the  weak 
solution  formulation.  For  the  finite  element  approximation,  rather  than  impose  the  weak 
solution  criterion  on  the  entire  infinite-dimensional  function  space,  one  again  restricts  to  a 
suitably chosen finite-dimensional subspace. For positive definite boundary value problems,
which  necessarily  admit  a  minimization  principle,  the  weak  solution  approach  leads  to  the 
same  finite  element  equations. 

A  rigorous  justification  and  proof  of  convergence  of  the  finite  element  approximations 
requires  further  analysis,  and  we  refer  the  interested  reader  to  more  specialized  texts, 
such as [6, 113, 126]. In this chapter, we shall focus our effort on understanding how to
formulate  and  implement  the  finite  element  method  in  practical  contexts. 


10.1  Minimization  and  Finite  Elements 


To  explain  the  principal  ideas  underpinning  the  finite  element  method,  we  return  to  the 
abstract  framework  for  boundary  value  problems  that  was  developed  in  Chapter  9.  Recall 
Theorem 9.26, which characterizes the unique solution to a positive definite linear system as the minimizer, $u_\star \in U$, of an associated quadratic functional $Q\colon U \to \mathbb{R}$. For boundary
value  problems  governed  by  differential  equations,  U  is  an  infinite-dimensional  function 
space  containing  all  sufficiently  smooth  functions  that  satisfy  the  prescribed  homogeneous 
boundary  conditions.  (Modifications  to  deal  with  inhomogeneous  boundary  conditions  will 
be  discussed  in  due  course.) 

This framework sets the stage for the first key idea of the finite element method. Instead of trying to minimize the functional $Q[u]$ over the entire infinite-dimensional function space, we will seek to minimize it over a finite-dimensional subspace $W \subset U$. The effect is to reduce a problem in analysis (a boundary value problem for a differential equation) to a problem in linear algebra, and hence one that a computer is capable of solving.
On  the  surface,  the  idea  seems  crazy:  how  could  one  expect  to  come  close  to  finding  the 
minimizer  in  a  gigantic  infinite-dimensional  function  space  by  restricting  the  search  to  a 
mere  finite-dimensional  subspace?  But  this  is  where  the  magic  of  infinite  dimensions  comes 
into  play.  One  can,  in  fact,  approximate  all  (reasonable)  functions  arbitrarily  closely  by 
functions  belonging  to  finite-dimensional  subspaces.  Indeed,  you  are  already  familiar  with 
two  examples:  Fourier  series,  where  one  approximates  rather  general  periodic  functions  by 
trigonometric  polynomials,  and  interpolation  theory,  in  which  one  approximates  functions 
by  ordinary  polynomials,  or,  more  sophisticatedly,  by  splines,  [89,  102].  Thus,  the  finite 
element  idea  perhaps  is  not  as  outlandish  as  it  might  initially  seem. 

To be a bit more explicit, let us begin with a linear operator $L\colon U \to V$ between real inner product spaces, where, as in Section 9.1, $\langle u, \tilde u\rangle$ is used to denote the inner product in $U$, and $\langle\langle v, \tilde v\rangle\rangle$ the inner product in $V$. To ensure uniqueness of solutions, we always assume that $L$ has trivial kernel: $\ker L = \{0\}$. According to Theorem 9.26, the element $u_\star \in U$ that minimizes the quadratic function(al)

$$Q[u] = \tfrac12\,|||\,L[u]\,|||^2 - \langle f, u\rangle, \tag{10.1}$$

where $|||\cdot|||$ denotes the norm in $V$, is the solution to the linear system

$$S[u] = f, \qquad \text{where} \quad S = L^*\circ L, \tag{10.2}$$
with $L^*\colon V \to U$ denoting the adjoint operator. The hypothesis that $L$ has trivial kernel implies that $S$ is a self-adjoint positive definite linear operator, which implies that the solution to (10.2), and hence the minimizer of $Q[u]$, is unique. In our applications, $L$ is a linear differential operator between function spaces, e.g., the gradient, while $Q[u]$ represents a quadratic functional, e.g., the Dirichlet principle, and the associated linear system (10.2) forms a positive definite boundary value problem, e.g., the Poisson equation along with suitable boundary conditions.

To form a finite element approximation to the solution $u_\star \in U$, rather than try to minimize $Q[u]$ on the entire function space $U$, we now seek to minimize it on a suitably chosen finite-dimensional subspace $W \subset U$. We will specify $W$ by selecting a set of linearly independent functions $\varphi_1, \ldots, \varphi_n \in U$, and letting $W$ be their span. Thus, $\varphi_1, \ldots, \varphi_n$ form a basis of $W$, whereby $\dim W = n$, and the general element of $W$ is a (uniquely determined) linear combination

$$w(x) = c_1\varphi_1(x) + \cdots + c_n\varphi_n(x) \tag{10.3}$$

of the basis functions. Our goal is to minimize $Q[w]$ over all possible $w \in W$; in other words, we need to determine the coefficients $c_1, \ldots, c_n \in \mathbb{R}$ such that

$$Q[w] = Q[c_1\varphi_1 + \cdots + c_n\varphi_n] \tag{10.4}$$

is  as  small  as  possible.  Substituting  (10.3)  back  into  (10.1)  and  then  expanding,  using 
the  linearity  of  L  and  then  the  bilinearity  of  the  inner  product,  we  find  that  the  resulting 
expression  is  the  quadratic  function 


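$$Q[w] = \frac12\sum_{i,j=1}^{n} k_{ij}\,c_i c_j - \sum_{i=1}^{n} b_i c_i = \tfrac12\,\mathbf{c}^T K\,\mathbf{c} - \mathbf{c}^T\mathbf{b}, \tag{10.5}$$

in which

•  $\mathbf{c} = (c_1, c_2, \ldots, c_n)^T \in \mathbb{R}^n$ is the vector of unknown coefficients in (10.3);

•  $K = (k_{ij})$ is the symmetric $n \times n$ matrix with entries

$$k_{ij} = \langle\langle L[\varphi_i], L[\varphi_j]\rangle\rangle, \qquad i, j = 1, \ldots, n; \tag{10.6}$$

•  $\mathbf{b} = (b_1, b_2, \ldots, b_n)^T$ is the vector with entries

$$b_i = \langle f, \varphi_i\rangle, \qquad i = 1, \ldots, n. \tag{10.7}$$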

Note  that  formula  (10.6)  uses  the  inner  product  on  the  target  space  V,  whereas  (10.7) 
relies  on  the  inner  product  on  the  domain  space  U . 

Thus, once we specify the basis functions $\varphi_i$, the coefficients $k_{ij}$ and $b_i$ are all known quantities. We have effectively reduced our original problem to the finite-dimensional problem of minimizing the quadratic function (10.5) over all possible vectors $\mathbf{c} \in \mathbb{R}^n$. The symmetric matrix $K$ is, in fact, positive definite, since, by the preceding computation,

$$\mathbf{c}^T K\,\mathbf{c} = \sum_{i,j=1}^{n} k_{ij}\,c_i c_j = |||\,L[c_1\varphi_1 + \cdots + c_n\varphi_n]\,|||^2 = |||\,L[w]\,|||^2 > 0, \tag{10.8}$$

as long as $L[w] \ne 0$. Moreover, our initial assumption tells us that $L[w] = 0$ if and only if $w = 0$, which, by linear independence, occurs only when $\mathbf{c} = \mathbf{0}$. Thus, (10.8) is indeed positive for all $\mathbf{c} \ne \mathbf{0}$. We can now invoke the finite-dimensional minimization result contained in Example 9.25 to conclude that the unique minimizer to (10.5) is obtained by solving the associated linear system

$$K\,\mathbf{c} = \mathbf{b}, \qquad \text{whereby} \quad \mathbf{c} = K^{-1}\mathbf{b}. \tag{10.9}$$
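The reduction to the linear system (10.9) can be sketched in a few lines of Python. The sketch below takes $L = D$ (the derivative) with the $L^2$ inner product on both spaces, so that $k_{ij} = \int \varphi_i'\varphi_j'\,dx$ and $b_i = \int f\,\varphi_i\,dx$; it approximates these integrals by Simpson's rule and solves the system by Gaussian elimination. The test problem $-u'' = 1$, $u(0) = u(1) = 0$, the polynomial basis, and every function name are assumptions made for illustration, not part of the text.

```python
def simpson(g, a, b, n=200):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for m in range(1, n):
        total += (4 if m % 2 else 2) * g(a + m * h)
    return total * h / 3.0

def solve(K, b):
    """Gaussian elimination with partial pivoting for K c = b."""
    n = len(b)
    A = [row[:] + [rhs] for row, rhs in zip(K, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for s in range(col, n + 1):
                A[r][s] -= factor * A[col][s]
    c = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        c[r] = (A[r][n] - sum(A[r][s] * c[s] for s in range(r + 1, n))) / A[r][r]
    return c

def ritz_coefficients(basis, dbasis, f, a, b):
    """Assemble K and b as in (10.6), (10.7) for L = d/dx, then solve (10.9)."""
    n = len(basis)
    K = [[simpson(lambda x, i=i, j=j: dbasis[i](x) * dbasis[j](x), a, b)
          for j in range(n)] for i in range(n)]
    rhs = [simpson(lambda x, i=i: f(x) * basis[i](x), a, b) for i in range(n)]
    return solve(K, rhs)

# Assumed test problem: -u'' = 1, u(0) = u(1) = 0, with two polynomial basis functions.
basis = [lambda x: x * (1 - x), lambda x: x * x * (1 - x)]
dbasis = [lambda x: 1 - 2 * x, lambda x: 2 * x - 3 * x * x]
c = ritz_coefficients(basis, dbasis, lambda x: 1.0, 0.0, 1.0)
print(c)
```

Because the exact solution $u(x) = \tfrac12\,x(1-x)$ of this test problem happens to lie in the chosen subspace, the computed coefficients should be close to $(\tfrac12, 0)$, up to quadrature and rounding error.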
Remark: When of moderate size, the linear system (10.9) can be solved by basic
Gaussian  Elimination.  When  the  size  (i.e.,  the  dimension,  n,  of  the  subspace  W)  becomes 
too  large,  as  is  often  the  case  in  dealing  with  partial  differential  equations,  it  is  better  to 
rely  on  an  iterative  linear  system  solver,  e.g.,  Gauss-Seidel  or  Successive  Over-Relaxation 
(SOR);  see  [89,  118]  for  details. 


This  summarizes  the  basic  abstract  setting  for  the  finite  element  method.  The  key 
issue,  then,  is  how  to  effectively  choose  the  finite-dimensional  subspace  W.  Two  candidates 
that  might  spring  to  mind  are  the  space  of  polynomials  of  degree  <  n  and  the  space  of 
trigonometric  polynomials  (truncated  Fourier  series)  of  degree  <  n.  However,  for  a  variety 
of  reasons,  neither  is  well  suited  to  the  finite  element  method.  One  constraint  is  that 
the  functions  in  W  must  satisfy  the  relevant  boundary  conditions  —  otherwise,  W  would 
not  be  a  subspace  of  U .  More  importantly,  in  order  to  obtain  sufficient  accuracy  of  the 
approximate  solution,  the  linear  algebraic  system  (10.9)  will  typically  —  especially  when 
dealing  with  partial  differential  equations  —  be  quite  large,  and  hence  it  is  desirable  that 
the  coefficient  matrix  K  be  as  sparse  as  possible,  i.e.,  have  lots  of  zero  entries.  Otherwise, 
computing  the  solution  may  well  be  too  time-consuming  to  be  of  much  practical  value. 

With  this  in  mind,  the  second  innovative  contribution  of  the  finite  element  method  is  to 
first  (paradoxically)  enlarge  the  space  U  of  allowable  functions  upon  which  to  minimize  the 
quadratic functional $Q[u]$. The governing differential equation requires its (classical) solutions to have a certain degree of smoothness, whereas the associated minimization principle typically requires that they possess only half as many derivatives. Thus, for second-order boundary value problems, the differential equation requires continuous second-order derivatives, while the quadratic functional $Q[u]$ involves only first-order derivatives. In fact, it
can  be  rigorously  shown  that,  under  rather  mild  hypotheses,  the  functional  retains  the 
same  minimizing  solution,  even  when  one  allows  functions  that  fail  to  qualify  as  classical 
solutions  to  the  differential  equation.  We  will  proceed  to  develop  the  method  in  the  context 
of  particular,  fairly  elementary  examples. 


Exercises 


10.1.1. Let $U = \{\,u(x) \in C^2[0,\pi] \mid u(0) = u(\pi) = 0\,\}$ and $V = \{\,v(x) \in C^1[0,\pi]\,\}$ both be equipped with the $L^2$ inner product. Let $L\colon U \to V$ be given by $L[u] = D[u] = u'$, and $f(x) = x - 1$. (a) Write out the quadratic functional $Q[u]$ given by (10.1). (b) Write out the associated boundary value problem (10.2). (c) Find the function $u_\star(x) \in U$ that minimizes $Q[u]$. What is the value of $Q[u_\star]$? (d) Let $W \subset U$ be the subspace spanned by $\sin x$ and $\sin 2x$. Write out the corresponding finite-dimensional minimization problem (10.8). (e) Find the function $w_\star(x) \in W$ that minimizes $Q[w]$. Is $Q[w_\star] > Q[u_\star]$? If not, why not? How close is your finite element minimizer $w_\star(x)$ to the actual minimizer $u_\star(x)$?

10.1.2. Let $U = \{\,u(x) \in C^2[0,1] \mid u(0) = u(1) = 0\,\}$ and $V = \{\,v(x) \in C^1[0,1]\,\}$ both have the $L^2$ inner product. Let $L\colon U \to V$ be given by $L[u] = u'(x) - u(x)$, and $f(x) = 1$ for all $x$. (a) Write out the quadratic functional $Q[u]$ given by (10.1). (b) Write out the associated boundary value problem (10.2). (c) Find the function $u_\star(x) \in U$ that minimizes $Q[u]$. What is the value of $Q[u_\star]$? (d) Let $W \subset U$ be the subspace containing all cubic polynomials $p(x)$ that satisfy the boundary conditions: $p(0) = p(1) = 0$. Find a basis of $W$ and then write out the corresponding finite-dimensional minimization problem (10.8). (e) Find the polynomial $p_\star(x) \in W$ that minimizes $Q[p]$ for $p \in W$. Is $Q[p_\star] > Q[u_\star]$? If not, why not? How close is your finite element minimizer $p_\star(x)$ to the minimizer $u_\star(x)$?
10.1.3. Let $U = \{\,u(x) \in C^2[1,2] \mid u(1) = u(2) = 0\,\}$, $V = \{\,(v_1(x), v_2(x))^T \mid v_1, v_2 \in C^1[1,2]\,\}$


both  be  endowed  with  the  L2  inner  product.  Let  L:U  V  be  given  by  L[a]  =  (  ^  1 , 

and let $f(x) = 2$ for all $1 < x < 2$. (a) Write out the quadratic functional $Q[u]$ given by (10.1). (b) Write out the associated boundary value problem (10.2). (c) Find the function $u_\star(x) \in U$ that minimizes $Q[u]$. What is the value of $Q[u_\star]$? (d) Let $W \subset U$ be the subspace containing all cubic polynomials $p(x)$ that satisfy the boundary conditions $p(1) = p(2) = 0$. Find a basis of $W$ and then write out the corresponding finite-dimensional minimization problem (10.8). (e) Find the polynomial $p_\star(x) \in W$ that minimizes $Q[p]$ for $p \in W$. Is $Q[p_\star] > Q[u_\star]$? If not, why not? How close is your finite element minimizer $p_\star(x)$ to the actual minimizer $u_\star(x)$?

C 10.1.4. (a) Find the solution to the boundary value problem $-u'' = x^2 - x$, $u(-1) = u(1) = 0$.

(b) Write down a quadratic functional $Q[u]$ that is minimized by your solution.

(c) Let $W$ be the subspace spanned by the two functions $1 - x^2$, $x(1 - x^2)$. Find the function $w_\star(x) \in W$ that minimizes the restriction of your quadratic functional to $W$. Compare $w_\star$ with your solution from part (a). (d) Answer part (c) for the subspace $W$ spanned by $\sin \pi x$, $\sin 2\pi x$. Which of the two approximations is the better?

C 10.1.5. (a) Find the function $u_\star(x)$ that minimizes $Q[u] = \int_0^1 \bigl[\tfrac12\,(x+1)\,u'(x)^2 - u(x)\bigr]\,dx$ over the vector space $U$ consisting of $C^2$ functions satisfying $u(0) = u(1) = 0$. (b) Let $W_3 \subset U$ be the subspace consisting of all cubic polynomials $w(x)$ that satisfy the same boundary conditions. Find the function $w_\star(x)$ that minimizes the restriction $Q[w]$ for $w \in W_3$. Compare $w_\star(x)$ and $u_\star(x)$: how close are they in the $L^2$ norm? What is the maximal discrepancy $|w_\star(x) - u_\star(x)|$ for $0 < x < 1$? (c) Suppose you enlarge your finite-dimensional subspace $W_4 \subset U$ to contain all quartic polynomials that satisfy the boundary conditions. Is your new finite element approximation better? Discuss.


C 10.1.6. (a) Find the function $u_\star(x)$ that minimizes $Q[u] = \int_0^1 \bigl[\tfrac12\,e^x\,u'(x)^2 - 3\,u(x)\bigr]\,dx$ over the space $U$ consisting of $C^2$ functions satisfying the boundary conditions $u(0) = u'(1) = 0$.

(b) Let $W \subset U$ be the subspace containing all cubic polynomials $w(x)$ that satisfy the boundary conditions. Find the polynomial $w_\star(x)$ that minimizes the restriction $Q[w]$ for $w \in W$. Compare $w_\star(x)$ and $u_\star(x)$: how close are they in the $L^2$ norm? What is the maximal discrepancy $|w_\star(x) - u_\star(x)|$ for $0 < x < 1$?


10.1.7. Consider the Dirichlet boundary value problem

$$-\Delta u = x(1-x) + y(1-y), \qquad u(x,0) = u(x,1) = u(0,y) = u(1,y) = 0,$$

on the unit square $\{0 < x, y < 1\}$.

(a) Find the exact solution $u_\star(x,y)$. Hint: It is a polynomial.

(b) Write down a minimization principle $Q[u]$ that characterizes the solution. Be careful to specify the function space $U$ over which the minimization takes place.

(c) Let $W \subset U$ be the subspace spanned by the four functions $\sin \pi x \sin \pi y$, $\sin 2\pi x \sin \pi y$, $\sin \pi x \sin 2\pi y$, and $\sin 2\pi x \sin 2\pi y$. Find the function $w_\star \in W$ that minimizes the restriction of $Q[w]$ to $w \in W$. How close is $w_\star$ to the solution you found in part (a)?

0  10.1.8.  Justify  the  identification  of  (10.4)  with  the  quadratic  function  (10.5). 


10.2  Finite  Elements  for  Ordinary  Differential  Equations 

To understand the preceding abstract formulation in concrete terms, let us focus our attention on boundary value problems governed by a second-order ordinary differential equation. For example, we might be interested in solving a Sturm-Liouville problem (9.71) subject
to,  say,  homogeneous  Dirichlet  boundary  conditions.  Once  we  understand  how  the  finite 
element  constructions  work  in  this  relatively  simple  context,  we  will  be  in  a  good  position 
to  extend  the  techniques  to  much  more  general  linear  boundary  value  problems  governed 
by  elliptic  partial  differential  equations. 

For  such  one-dimensional  boundary  value  problems,  a  popular  and  effective  choice 
of  the  finite-dimensional  subspace  W  is  to  employ  continuous,  piecewise  affine  functions. 
Recall  that  a  function  is  affine  if  its  graph  is  a  straight  line:  f(x)  =  ax  +  b.  (The  function 
is  linear ,  in  accordance  with  Definition  B.32,  if  and  only  if  b  =  0.)  A  function  is  called 
piecewise  affine  if  its  graph  consists  of  a  finite  number  of  straight  line  segments;  a  typical 
example  is  plotted  in  Figure  10.1.  Continuity  requires  that  the  individual  segments  be 
connected  together  end  to  end. 

Given a boundary value problem on a bounded interval $[a,b]$, let us fix a finite collection of nodes

$$a = x_0 < x_1 < x_2 < \cdots < x_{n-1} < x_n = b.$$

The formulas simplify if one uses equally spaced nodes, but this is not necessary for the construction to be carried out. Let $W$ denote the vector space consisting of all continuous functions $w(x)$ that are defined on the interval $a \le x \le b$, satisfy the homogeneous boundary conditions, and are affine when restricted to each subinterval $[x_j, x_{j+1}]$. On each subinterval, we write

$$w(x) = c_j + b_j\,(x - x_j), \qquad \text{for } x_j \le x \le x_{j+1}, \quad j = 0, \ldots, n-1,$$

for certain constants $c_j, b_j$. Continuity of $w(x)$ requires

$$c_j = w(x_j^+) = w(x_j^-) = c_{j-1} + b_{j-1}\,h_{j-1}, \qquad j = 1, \ldots, n-1, \tag{10.10}$$

where $h_{j-1} = x_j - x_{j-1}$ denotes the length of the $j$th subinterval. The homogeneous Dirichlet boundary conditions at the endpoints require

$$w(a) = c_0 = 0, \qquad w(b) = c_{n-1} + b_{n-1}\,h_{n-1} = 0. \tag{10.11}$$

Observe that the function $w(x)$ involves a total of $2n$ unspecified coefficients $c_0, \ldots, c_{n-1}$, $b_0, \ldots, b_{n-1}$. The continuity conditions (10.10) and the second boundary condition (10.11) uniquely determine the $b_j$. The first boundary condition specifies $c_0$, while the remaining $n-1$ coefficients $c_1 = w(x_1), \ldots, c_{n-1} = w(x_{n-1})$ are arbitrary, specifying the values of $w(x)$ at the interior nodes. We conclude that the finite element subspace $W$ has dimension $n-1$, the number of interior nodes.
Remark: Every function $w(x)$ in our subspace has piecewise constant first derivative $w'(x)$. However, the jump discontinuities in $w'(x)$ imply that its second derivative $w''(x)$ may well include delta function impulses at the nodes, and hence $w(x)$ is far from being a solution to the differential equation. Nevertheless, in practice, the finite element minimizer $w_\star(x) \in W$ will (under suitable assumptions) provide a reasonable approximation to the actual solution $u_\star(x)$.

The most convenient basis for $W$ consists of the hat functions, which are continuous, piecewise affine functions satisfying

$$\varphi_j(x_k) = \begin{cases} 1, & j = k,\\ 0, & j \ne k,\end{cases} \qquad \text{for } j = 1, \ldots, n-1, \quad k = 0, \ldots, n. \tag{10.12}$$

The graph of a typical hat function appears in Figure 10.2. The explicit formula is easily established:

$$\varphi_j(x) = \begin{cases} \dfrac{x - x_{j-1}}{x_j - x_{j-1}}, & x_{j-1} \le x \le x_j,\\[2ex] \dfrac{x_{j+1} - x}{x_{j+1} - x_j}, & x_j \le x \le x_{j+1},\\[2ex] 0, & x \le x_{j-1} \ \text{or}\ x \ge x_{j+1},\end{cases} \qquad j = 1, \ldots, n-1. \tag{10.13}$$


One advantage of using these basis functions is that, thanks to (10.12), the coefficients in the linear combination

$$w(x) = c_1\varphi_1(x) + \cdots + c_{n-1}\varphi_{n-1}(x)$$

coincide with its values at the nodes:

$$c_j = w(x_j), \qquad j = 1, \ldots, n-1. \tag{10.14}$$
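The interpolation property (10.12) and its consequence (10.14) are easy to check numerically. The following Python sketch evaluates the hat functions (10.13) on a nonuniform mesh and confirms that a linear combination takes the values of its coefficients at the interior nodes. The particular mesh, the coefficient values, and the function names are assumptions chosen for illustration.

```python
def hat(j, nodes, x):
    """Hat function phi_j of (10.13) on the mesh nodes[0] < ... < nodes[n], 1 <= j <= n-1."""
    left, mid, right = nodes[j - 1], nodes[j], nodes[j + 1]
    if left <= x <= mid:
        return (x - left) / (mid - left)      # rising segment
    if mid < x <= right:
        return (right - x) / (right - mid)    # falling segment
    return 0.0                                # outside the support

def piecewise_affine(coeffs, nodes, x):
    """w(x) = sum of c_j phi_j(x), where coeffs[j-1] = c_j for the interior nodes."""
    return sum(c * hat(j, nodes, x) for j, c in enumerate(coeffs, start=1))

nodes = [0.0, 0.2, 0.5, 0.6, 1.0]   # assumed nonuniform mesh with n = 4
coeffs = [1.5, -0.3, 2.0]           # assumed values c_1, c_2, c_3
values = [piecewise_affine(coeffs, nodes, xk) for xk in nodes]
print(values)
```

The printed list should reproduce the coefficients at the three interior nodes and vanish at both endpoints, in accordance with the homogeneous Dirichlet conditions.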


Example 10.1. Let $\kappa(x) > 0$ for $0 < x < \ell$. Consider the equilibrium equations

$$S[u] = -\,\frac{d}{dx}\!\left(\kappa(x)\,\frac{du}{dx}\right) = f(x), \qquad 0 < x < \ell, \qquad u(0) = u(\ell) = 0,$$

for a nonuniform bar with fixed ends and variable stiffness $\kappa(x)$, that is subject to an external forcing $f(x)$. In order to find a finite element approximation to the resulting displacement $u(x)$, we begin with the minimization principle based on the quadratic functional

$$Q[u] = \int_0^\ell \left[\tfrac12\,\kappa(x)\,u'(x)^2 - f(x)\,u(x)\right]dx,$$

which is a special case of (9.75). We divide the interval $[0,\ell]$ into $n$ equal subintervals, each of length $h = \ell/n$. The resulting uniform mesh has nodes

$$x_j = j\,h = \frac{j\,\ell}{n}, \qquad j = 0, \ldots, n.$$


The corresponding finite element basis hat functions are explicitly given by

$$\varphi_j(x) = \begin{cases} (x - x_{j-1})/h, & x_{j-1} \le x \le x_j,\\ (x_{j+1} - x)/h, & x_j \le x \le x_{j+1},\\ 0, & \text{otherwise},\end{cases} \qquad j = 1, \ldots, n-1. \tag{10.15}$$

The associated linear system (10.9) has coefficient matrix entries


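$$k_{ij} = \langle\langle \varphi_i', \varphi_j'\rangle\rangle = \int_0^\ell \varphi_i'(x)\,\varphi_j'(x)\,\kappa(x)\,dx, \qquad i, j = 1, \ldots, n-1.$$

Since the function $\varphi_i(x)$ vanishes except on the interval $x_{i-1} < x < x_{i+1}$, while $\varphi_j(x)$ vanishes outside $x_{j-1} < x < x_{j+1}$, the integral will vanish unless $i = j$ or $i = j \pm 1$. Moreover,

$$\varphi_j'(x) = \begin{cases} 1/h, & x_{j-1} < x < x_j,\\ -1/h, & x_j < x < x_{j+1},\\ 0, & \text{otherwise},\end{cases} \qquad j = 1, \ldots, n-1.$$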


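Therefore, the finite element coefficient matrix assumes the tridiagonal form

$$K = \frac{1}{h^2}\begin{pmatrix} s_0 + s_1 & -s_1 & & & & \\ -s_1 & s_1 + s_2 & -s_2 & & & \\ & -s_2 & s_2 + s_3 & -s_3 & & \\ & & \ddots & \ddots & \ddots & \\ & & & -s_{n-3} & s_{n-3} + s_{n-2} & -s_{n-2} \\ & & & & -s_{n-2} & s_{n-2} + s_{n-1} \end{pmatrix}, \tag{10.16}$$

where

$$s_j = \int_{x_j}^{x_{j+1}} \kappa(x)\,dx \tag{10.17}$$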


is the total stiffness of the $j$th subinterval. The corresponding right-hand side has entries

$$b_j = \langle f, \varphi_j\rangle = \int_0^\ell f(x)\,\varphi_j(x)\,dx = \frac1h\left[\int_{x_{j-1}}^{x_j} (x - x_{j-1})\,f(x)\,dx + \int_{x_j}^{x_{j+1}} (x_{j+1} - x)\,f(x)\,dx\right]. \tag{10.18}$$

In practice, we do not have to explicitly evaluate the integrals (10.17, 18), but may replace
them by suitably close numerical approximations. When the step size $h \ll 1$ is small, then
the integrals are taken over small intervals, and so the elementary trapezoid rule, [24, 108],
produces sufficiently accurate approximations:

\[ s_j \approx \frac{h}{2}\bigl[ \kappa(x_j) + \kappa(x_{j+1}) \bigr], \qquad b_j \approx h\,f(x_j). \tag{10.19} \]

The resulting finite element system $K \mathbf{c} = \mathbf{b}$ is then solved for $\mathbf{c}$, whose entries, according
to (10.14), coincide with the values of the finite element approximation to the solution at
the nodes: $c_j = w(x_j) \approx u(x_j)$. Indeed, the tridiagonal Gaussian Elimination algorithm,
[89], will rapidly produce the desired solution. Since the accuracy of the finite element
solution increases with the number of nodes, this numerical scheme allows us to easily
compute very accurate approximations to the solution to the boundary value problem.
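The procedure just described can be sketched in a few lines of Python. This is a minimal illustration rather than a definitive implementation: the function names `fem_1d` and `solve_tridiagonal` are our own, the mesh is assumed to be equally spaced with homogeneous Dirichlet conditions at both ends, the entries are computed with the trapezoid approximations (10.19), and the elimination is done without pivoting, which is adequate here because the coefficient matrix is symmetric and positive definite when $\kappa(x) > 0$.

```python
import math

def solve_tridiagonal(sub, diag, sup, rhs):
    """Tridiagonal Gaussian elimination without pivoting.

    sub and sup hold the n-1 sub- and super-diagonal entries,
    diag the n diagonal entries, rhs the right-hand side."""
    n = len(diag)
    d, r = list(diag), list(rhs)
    for i in range(1, n):                      # forward elimination
        m = sub[i - 1] / d[i - 1]
        d[i] -= m * sup[i - 1]
        r[i] -= m * r[i - 1]
    c = [0.0] * n
    c[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):             # back substitution
        c[i] = (r[i] - sup[i] * c[i + 1]) / d[i]
    return c

def fem_1d(kappa, f, ell, n):
    """Nodal values of the piecewise affine approximation on n equal subintervals of [0, ell]."""
    h = ell / n
    x = [j * h for j in range(n + 1)]
    # (10.19): trapezoid approximations of the stiffnesses s_0, ..., s_{n-1}
    s = [0.5 * h * (kappa(x[j]) + kappa(x[j + 1])) for j in range(n)]
    # (10.16): rows j = 1, ..., n-1 of the tridiagonal matrix
    diag = [(s[j - 1] + s[j]) / h**2 for j in range(1, n)]
    off = [-s[j] / h**2 for j in range(1, n - 1)]
    b = [h * f(x[j]) for j in range(1, n)]
    c = solve_tridiagonal(off, diag, off, b)
    return x, [0.0] + c + [0.0]                # attach the zero boundary values

# Test problem: kappa(x) = 1 + x, f(x) = 1 on [0, 1], with n = 5 subintervals
x, c = fem_1d(lambda t: 1.0 + t, lambda t: 1.0, 1.0, 5)
exact = [-t + math.log(t + 1.0) / math.log(2.0) for t in x]
max_nodal_error = max(abs(cj - uj) for cj, uj in zip(c, exact))
print(max_nodal_error)
```

The test problem is the one treated in Example 10.3, so the printed value can be compared against the nodal errors reported there. For that coefficient the trapezoid rule happens to integrate $\kappa$ exactly, so no additional quadrature error enters.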

In particular, in the homogeneous case $\kappa(x) \equiv 1$, the coefficient matrix (10.16) reduces
to the special form

\[ K = \frac{1}{h} \begin{pmatrix}
2 & -1 \\
-1 & 2 & -1 \\
 & -1 & 2 & -1 \\
 & & \ddots & \ddots & \ddots \\
 & & & -1 & 2 & -1 \\
 & & & & -1 & 2
\end{pmatrix}. \tag{10.20} \]

In this case, the jth equation in the finite element linear system is, upon dividing by $h$,

\[ -\,\frac{c_{j+1} - 2\,c_j + c_{j-1}}{h^2} = f(x_j). \tag{10.21} \]

Since $c_j \approx u(x_j)$, the left-hand side coincides with the standard finite difference
approximation to minus the second derivative $-u''(x_j)$ at the node $x_j$; cf. (5.5). As a result,
in this particular case the finite element and finite difference numerical solution schemes
happen to coincide.
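The intermediate step behind (10.21) can be written out as follows. It uses only the matrix (10.16), the stiffnesses (10.17), and the trapezoid approximation (10.19), together with the convention $c_0 = c_n = 0$ for the boundary values.

```latex
% kappa(x) = 1 in (10.17) gives s_j = h for every j, so row j of K c = b reads
\frac{1}{h^2}\bigl[ -s_{j-1}\,c_{j-1} + (s_{j-1} + s_j)\,c_j - s_j\,c_{j+1} \bigr]
  = \frac{-c_{j-1} + 2\,c_j - c_{j+1}}{h} = b_j \approx h\,f(x_j).
% Dividing both sides by h yields (10.21).
```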


The sparse tridiagonal nature of the finite element matrix is a consequence of the
fact that the basis functions are zero on much of the interval, or, in more mathematical
language, that they have small support, in the following sense.

Definition 10.2. The support of a function $f(x)$, written $\operatorname{supp} f$, is the closure of
the set where $f(x) \ne 0$.

Thus, a point $x$ will belong to the support, provided $f$ is not zero there, or at least
is not zero at nearby points. For example, the support of the hat function (10.13) is the
(small) interval $[x_{j-1}, x_{j+1}]$. The key property, ensuring sparseness, is that the integral
of the product of two functions will be zero if their supports have empty intersection, or,
slightly more generally, have only a finite number of points in common.

Example 10.3. Consider the boundary value problem

\[ -\,\frac{d}{dx}\left[ (x+1)\,\frac{du}{dx} \right] = 1, \qquad u(0) = 0, \quad u(1) = 0. \tag{10.22} \]

The explicit solution is easily found by direct integration:

\[ u(x) = -\,x + \frac{\log(x+1)}{\log 2}. \tag{10.23} \]

Figure 10.3. Finite element solution to (10.22).


It minimizes the associated quadratic functional

\[ Q[u] = \int_0^1 \left[ \tfrac12\,(x+1)\,u'(x)^2 - u(x) \right] dx \tag{10.24} \]

over the space of all $C^2$ functions $u(x)$ that satisfy the given boundary conditions. The
finite element system (10.9) has coefficient matrix given by (10.16) and right-hand side
(10.18), where

\[ s_j = \int_{x_j}^{x_{j+1}} (1+x)\,dx = h\,(1 + x_j) + \tfrac12 h^2 = h + h^2\bigl(j + \tfrac12\bigr), \qquad b_j = \int_{x_j}^{x_{j+1}} 1\,dx = h. \]


The  resulting  piecewise  affine  approximation  to  the  solution  is  plotted  in  Figure  10.3.  The 
first  three  graphs  contain,  respectively,  5,  10,  20  nodes,  so  that  h  =  .2,.1,.05,  while  the 
last  plots  the  exact  solution  (10.23).  The  maximal  errors  at  the  nodes  are,  respectively, 
.000298,  .000075,  .000019,  while  the  maximal  overall  errors  between  the  exact  solution  and 
its  piecewise  affine  finite  element  approximations  are  .00611,  .00166,  .00043.  (One  can  more 
closely  fit  the  solution  curve  by  employing  a  cubic  spline  to  interpolate  the  computed  nodal 
values,  [89,  102],  which  has  the  effect  of  reducing  the  preceding  maximal  overall  errors  by 
a  factor  of,  approximately,  20.)  Thus,  even  when  computed  on  rather  coarse  meshes,  the 
finite  element  approximation  gives  quite  respectable  results. 
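The error figures quoted above also indicate how quickly the approximation improves as the mesh is refined. The following short sketch, whose helper name `observed_orders` is our own, estimates the exponent $p$ in a presumed error law of the form error $\approx C h^p$ from consecutive pairs of the reported values. It uses only the numbers stated in the example and is meant as an empirical check, not a convergence proof.

```python
import math

h = [0.2, 0.1, 0.05]                          # mesh spacings from the example
nodal_err = [0.000298, 0.000075, 0.000019]    # maximal errors at the nodes
overall_err = [0.00611, 0.00166, 0.00043]     # maximal errors on the whole interval

def observed_orders(errors, steps):
    """Estimate p in error ~ C h^p from each pair of consecutive refinements."""
    return [math.log(errors[i] / errors[i + 1]) / math.log(steps[i] / steps[i + 1])
            for i in range(len(errors) - 1)]

nodal_orders = observed_orders(nodal_err, h)
overall_orders = observed_orders(overall_err, h)
print(nodal_orders)
print(overall_orders)
```

All four estimates come out close to 2, which suggests that halving $h$ reduces both kinds of error by roughly a factor of 4 in this example.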


Remark: One can obtain a smoother, and hence more realistic, approximation to the
solution by smoothly interpolating the finite element approximations $c_j \approx u(x_j)$ at the
nodes, e.g., by use of cubic splines, [89, 102]. Alternatively, one can require that the finite
element functions themselves be smoother, e.g., by making the finite element subspace
consist of piecewise cubic splines that satisfy the boundary conditions.

Exercises


10.2.1. Use the finite element method to approximate the solution to the boundary value
problem $-\dfrac{d}{dx}\left( e^{-x}\,\dfrac{du}{dx} \right) = 1$, $u(0) = u(2) = 0$. Carefully explain how you are setting
up the calculation. Plot the resulting solutions and compare your answer with the exact
solution. You should use an equally spaced mesh, but try at least three different mesh
spacings and compare your results. By inspecting the errors in your various approximations,
can you predict how many nodes would be required for six-digit accuracy of the
numerical approximation?

4b 10.2.2. For each of the following boundary value problems: (i) Solve the problem exactly.
(ii) Approximate the solution using the finite element method based on ten equally spaced
nodes. (iii) Compare the graphs of the exact solution and its piecewise affine finite element
approximation. What is the maximal error in your approximation at the nodes? on the
entire interval?


(a) 


u 


n 


(c)  - 


d_ 

dx 


i^(0)  =  u( 2)  =  0;  (b) 


d_ 

dx 


(1  +  x) 


du 

dx 


=  — x ,  u(  1)  =  u(3)  =  0;  (d)  — 


d_ 

dx 


X 


du 

dx 


=  e 


X 


1,  i^(0)  =  u(  1)  =  0; 


u(— 1)  =  u(l)  =  0. 


4 10.2.3. (a) Find the exact solution to the boundary value problem $-u'' = 3x$, $u(0) = u(1) = 0$.
(b) Use the finite element method based on five equally spaced nodes to approximate the
solution. (c) Compare the graphs of the exact solution and its piecewise affine finite element
approximation. (d) What is the maximal error (i) at the nodes? (ii) on the entire interval?

4 10.2.4. Use finite elements to approximate the solution to the Sturm-Liouville boundary value
problem $-u'' + (x+1)\,u = x\,e^x$, $u(0) = 0$, $u(1) = 0$, using 5, 10, and 20 equally spaced nodes.

4 10.2.5. (a) Devise a finite element scheme for numerically approximating the solution to the
mixed boundary value problem

\[ -\,\frac{d}{dx}\left( \kappa(x)\,\frac{du}{dx} \right) = f(x), \qquad a < x < b, \qquad u(a) = 0, \quad u'(b) = 0. \]

(b) Test your method on the particular boundary value problem

\[ -\,\frac{d}{dx}\left( (1+x)\,\frac{du}{dx} \right) = 1, \qquad 0 < x < 1, \qquad u(0) = 0, \quad u'(1) = 0, \]

using 10 equally spaced nodes. Compare your approximation with the exact solution.


4 10.2.6. Consider the periodic boundary value problem

\[ -\,u'' + u = x, \qquad u(0) = u(2\pi), \quad u'(0) = u'(2\pi). \]

(a) Write down the analytic solution. (b) Write down a minimization principle.
(c) Divide the interval $[0, 2\pi]$ into $n = 5$ equal subintervals, and let $W_n$ denote the
subspace consisting of all piecewise affine functions that satisfy the boundary conditions. What
is the dimension of $W_n$? Write down a basis. (d) Construct the finite element approximation
to the solution to the boundary value problem by minimizing the functional from part
(b) on the subspace $W_n$. Graph the result and compare with the exact solution. What is
the maximal error on the interval? (e) Repeat part (d) for $n = 10$, 20, and 40 subintervals,
and discuss the convergence of your solutions.

10.2.7. Answer Exercise 10.2.6 when the finite element subspace $W_n$ consists of all periodic
piecewise affine functions of period 1, so $w(x+1) = w(x)$. Which approximation is better?

£ 10.2.8. Use the method of Exercise 10.2.7 to approximate the solution to the following periodic
boundary value problem for the Mathieu equation:

\[ -\,u'' + (1 + \cos x)\,u = 1, \qquad u(0) = u(2\pi), \quad u'(0) = u'(2\pi). \]

10.2.9. Consider the boundary value problem solved in Example 10.3. Let $W_n$ be the
subspace consisting of all polynomials $u(x)$ of degree < n satisfying the boundary conditions
$u(0) = u(1) = 0$. In this project, we will try to approximate the exact solution to the
boundary value problem by minimizing the functional (10.24) on the polynomial subspace
$W_n$. For $n = 5$, 10, and 20: (a) First, determine a basis for $W_n$. (b) Set up the minimization
problem as a system of linear equations for the coefficients of the polynomial minimizer
relative to your basis. (c) Solve the polynomial minimization problem and compare your
“polynomial finite element” solution with the exact solution and the piecewise affine finite
element solution graphed in Figure 10.3.

10.2.10. Consider the boundary value problem $-u'' + \lambda u = x$, for $0 < x < \pi$, with $u(0) = 0$,
$u(1) = 0$. (a) For what values of $\lambda$ does the system have a unique solution? (b) For which
values of $\lambda$ can you find a minimization principle that characterizes the solution? Is the
minimizer unique for all such values of $\lambda$? (c) Using $n$ equally spaced nodes, write down
the finite element equations for approximating the solution to the boundary value problem.
Note: Although the finite element construction is supposed to work only when there is a
minimization principle, we will consider the resulting linear algebraic system for any value
of $\lambda$. (d) Select a value of $\lambda$ for which the solution can be characterized by a minimization
principle and verify that the finite element approximation with $n = 10$ approximates the
exact solution. (e) Experiment with other values of $\lambda$. Does your finite element solution
give a good approximation to the exact solution when it exists? What happens at values of
$\lambda$ for which the solution does not exist or is not unique?


10.3  Finite  Elements  in  Two  Dimensions 


The same basic framework underlies the adaptation of finite element techniques for
numerically approximating the solution to boundary value problems governed by elliptic
partial differential equations. In this section, we concentrate on the simplest case: the
two-dimensional Poisson equation. Having mastered this, the reader will be well equipped
to carry over the method to more general equations and higher dimensions. As before, we
concentrate on the practical design of the finite element procedure, and refer the reader
to more advanced texts, e.g., [6, 113, 126], for the analytical details and proofs of
convergence. Most of the multi-dimensional complications lie not in the underlying theory, but
rather in the realm of data management and organization.

For specificity, consider the homogeneous Dirichlet boundary value problem

\[ -\,\Delta u = f \quad \text{in } \Omega, \qquad u = 0 \quad \text{on } \partial\Omega, \tag{10.25} \]

on a bounded domain $\Omega \subset \mathbb{R}^2$. According to Theorem 9.31, the solution $u_\star(x,y)$ is
characterized as the unique minimizer of the Dirichlet functional

\[ Q[u] = \tfrac12\,\| \nabla u \|^2 - \langle u , f \rangle = \iint_\Omega \left( \tfrac12 u_x^2 + \tfrac12 u_y^2 - f\,u \right) dx\,dy \tag{10.26} \]

among all $C^2$ functions $u(x,y)$ that satisfy the prescribed boundary conditions.

To construct a finite element approximation, we restrict the Dirichlet functional to a
suitably chosen finite-dimensional subspace. As in the one-dimensional version, the most
effective subspaces contain functions that may lack the requisite degree of smoothness that
qualifies them as candidate solutions to the partial differential equation. Nevertheless, they
will provide good approximations to the actual classical solution. Another important
practical consideration, ensuring sparseness of the finite element matrix, is to employ functions
that have small support, meaning that they vanish on most of the domain. Sparseness has
the benefit that the solution to the linear finite element system can be relatively rapidly
calculated, usually by application of an iterative numerical scheme such as the Gauss-Seidel
or SOR methods discussed in [89, 118].

Figure 10.4. Triangulation of a planar domain.


Triangulation 

The first step is to introduce a mesh consisting of a finite number of nodes $\mathbf{x}_l = (x_l, y_l)$,
$l = 1, \ldots, m$, usually lying inside the domain $\Omega \subset \mathbb{R}^2$. Unlike finite difference schemes, finite
element methods are not tied to a rectangular mesh, thus endowing them with considerably
more flexibility in the allowable discretizations of the domain. We regard the nodes as the
vertices of a triangulation of the domain, consisting of a collection of non-overlapping small
triangles, which we denote by $T_1, \ldots, T_N$, whose union $T_\star = \bigcup_\nu T_\nu$ approximates $\Omega$; see
Figure 10.4 for a typical example. The nodes are split into two categories: interior nodes
and boundary nodes, the latter lying on or close to $\partial\Omega$. A curved boundary will thus be
approximated by the polygonal boundary $\partial T_\star$ of the triangulation, whose vertices
are the boundary nodes. Thus, in any practical implementation of a finite element scheme,
the first requirement is a routine that will automatically triangulate a specified domain in
some “reasonable” manner, as explained below.

As in our one-dimensional construction, the functions $w(x,y)$ in the finite-dimensional
subspace $W$ will be continuous and piecewise affine, which means that, on each triangle,
the graph of $w$ is a flat plane and hence has the formula†

\[ w(x,y) = \alpha^\nu + \beta^\nu x + \gamma^\nu y \quad \text{when } (x,y) \in T_\nu, \tag{10.27} \]

for certain constants $\alpha^\nu, \beta^\nu, \gamma^\nu$. Continuity of $w$ requires that its values on a common edge
between two triangles must agree, and this will impose compatibility constraints on the
coefficients $\alpha^\mu, \beta^\mu, \gamma^\mu$ and $\alpha^\nu, \beta^\nu, \gamma^\nu$ associated with adjacent pairs of triangles $T_\mu$ and $T_\nu$.

The full graph of the piecewise affine function $z = w(x,y)$ forms a connected polyhedral
surface whose triangular faces lie above the triangles $T_\nu$; see Figure 10.5 for an illustration.
In addition, we require that the piecewise affine function $w(x,y)$ vanish at the boundary
nodes, which implies that it vanishes on the entire polygonal boundary of the triangulation,
$\partial T_\star$, and hence (approximately) satisfies the homogeneous Dirichlet boundary conditions
on the curved boundary of the original domain, $\partial\Omega$.

† Here and subsequently, the index $\nu$ is a superscript, not a power.

Figure 10.5. Piecewise affine function.

The next step is to choose a basis of the subspace of piecewise affine functions
associated with the given triangulation and subject to the imposed homogeneous Dirichlet
boundary conditions. The analogue of the one-dimensional hat function (10.12) is the
pyramid function $\varphi_l(x,y)$, which has the value 1 at a single node $\mathbf{x}_l = (x_l, y_l)$, and vanishes at
all the other nodes:

\[ \varphi_l(x_i, y_i) = \begin{cases} 1, & i = l, \\ 0, & i \ne l. \end{cases} \tag{10.28} \]

Because, on any triangle, the pyramid function $\varphi_l(x,y)$ is uniquely determined by its values
at the vertices, it will be nonzero only on those triangles that have the node $\mathbf{x}_l$ as one of
their vertices. Hence, as its name implies, the graph of $\varphi_l$ forms a pyramid of unit height
sitting on a flat plane; a typical example appears in Figure 10.6.

The pyramid functions $\varphi_l(x,y)$ associated with the interior nodes $\mathbf{x}_l$ automatically
satisfy the homogeneous Dirichlet boundary conditions on the boundary of the domain,
or, more correctly, on the polygonal boundary of the triangulated domain. Thus, the finite
element subspace $W$ is the span of the interior node pyramid functions, and so a general
piecewise affine function $w \in W$ is a linear combination thereof:

\[ w(x,y) = \sum_{l=1}^{n} c_l\,\varphi_l(x,y), \tag{10.29} \]

where the sum ranges over the $n$ interior nodes of the triangulation. Owing to the original
specification (10.28) of the pyramid functions, the coefficients

\[ c_l = w(x_l, y_l) \approx u(x_l, y_l), \qquad l = 1, \ldots, n, \tag{10.30} \]

are the same as the values of the finite element approximation $w(x,y)$ at the interior
nodes. This immediately implies linear independence of the pyramid functions, since the
only linear combination that vanishes at all nodes is the trivial one $c_1 = \cdots = c_n = 0$.

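Determining the explicit formulas for the pyramid functions is not difficult. On one of
the triangles $T_\nu$ that has $\mathbf{x}_l$ as a vertex, $\varphi_l(x,y)$ will be the unique affine function (10.27)
that takes the value 1 at the vertex $\mathbf{x}_l$ and 0 at its other two vertices $\mathbf{x}_i$ and $\mathbf{x}_j$. Thus, we
seek a formula for an affine function or element

\[ \omega_l^\nu(x,y) = \alpha_l^\nu + \beta_l^\nu x + \gamma_l^\nu y, \qquad (x,y) \in T_\nu, \tag{10.31} \]

that takes the prescribed values

\[ \begin{aligned}
\omega_l^\nu(x_i, y_i) &= \alpha_l^\nu + \beta_l^\nu x_i + \gamma_l^\nu y_i = 0, \\
\omega_l^\nu(x_j, y_j) &= \alpha_l^\nu + \beta_l^\nu x_j + \gamma_l^\nu y_j = 0, \\
\omega_l^\nu(x_l, y_l) &= \alpha_l^\nu + \beta_l^\nu x_l + \gamma_l^\nu y_l = 1.
\end{aligned} \tag{10.32} \]

Solving this linear system for the coefficients, using either Cramer’s Rule or direct
Gaussian Elimination, produces the explicit formulas

\[ \alpha_l^\nu = \frac{x_i y_j - x_j y_i}{\Delta_\nu}, \qquad \beta_l^\nu = \frac{y_i - y_j}{\Delta_\nu}, \qquad \gamma_l^\nu = \frac{x_j - x_i}{\Delta_\nu}, \tag{10.33} \]

where the denominator

\[ \Delta_\nu = \det \begin{pmatrix} 1 & x_i & y_i \\ 1 & x_j & y_j \\ 1 & x_l & y_l \end{pmatrix} = \pm\,2\,\operatorname{area} T_\nu \tag{10.34} \]

is, up to sign, twice the area of the triangle $T_\nu$; see Exercise 10.3.5.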
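Formulas (10.33) and (10.34) translate almost verbatim into code. The sketch below, in which the function name `affine_elements` and the convention of passing a triangle as a list of three coordinate pairs are our own choices, returns the coefficients $(\alpha, \beta, \gamma)$ of all three affine elements of a triangle. The other two vertices are always taken in cyclic order, so the same determinant serves as the denominator for each element.

```python
def affine_elements(vertices):
    """Coefficients (alpha, beta, gamma) of the three affine elements of a triangle.

    vertices is a list of three (x, y) pairs; entry l of the result describes
    the element equal to 1 at vertices[l] and 0 at the other two vertices."""
    elements = []
    for l in range(3):
        # cyclic ordering (i, j, l) keeps the determinant (10.34) the same for every l
        (xi, yi), (xj, yj) = vertices[(l + 1) % 3], vertices[(l + 2) % 3]
        xl, yl = vertices[l]
        # determinant (10.34), expanded along its first column
        delta = (xj * yl - xl * yj) - (xi * yl - xl * yi) + (xi * yj - xj * yi)
        if delta == 0:
            raise ValueError("degenerate triangle")
        elements.append(((xi * yj - xj * yi) / delta,   # alpha, from (10.33)
                         (yi - yj) / delta,             # beta
                         (xj - xi) / delta))            # gamma
    return elements

# Right triangle with vertices (0,0), (1,0), (0,1)
print(affine_elements([(0, 0), (1, 0), (0, 1)]))
# [(1.0, -1.0, -1.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
```

The printed triples correspond to the affine functions $1 - x - y$, $x$, and $y$, each of which equals 1 at its own vertex and 0 at the other two.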

Example 10.4. Consider an isosceles right triangle $T$ with vertices

\[ \mathbf{x}_1 = (0,0), \qquad \mathbf{x}_2 = (1,0), \qquad \mathbf{x}_3 = (0,1). \]

Using (10.33-34) (or solving the linear system (10.32) directly), we immediately produce
the three corresponding affine elements

\[ \omega_1(x,y) = 1 - x - y, \qquad \omega_2(x,y) = x, \qquad \omega_3(x,y) = y. \tag{10.35} \]

As required, each $\omega_l$ equals 1 at the vertex $\mathbf{x}_l$ and is zero at the other two vertices.

Figure 10.7. Vertex polygons.

A pyramid function is then obtained by piecing together the individual affine elements:

\[ \varphi_l(x,y) = \begin{cases} \omega_l^\nu(x,y), & \text{if } (x,y) \in T_\nu \text{ and } \mathbf{x}_l \text{ is a vertex of } T_\nu, \\ 0, & \text{otherwise.} \end{cases} \tag{10.36} \]

Continuity of $\varphi_l(x,y)$ is assured, since the constituent affine elements have the same values
at common vertices, and hence also along common edges. The support of the pyramid
function (10.36) is the vertex polygon

\[ \operatorname{supp} \varphi_l = P_l = \bigcup_\nu T_\nu \tag{10.37} \]

consisting of all the triangles $T_\nu$ that have the node $\mathbf{x}_l$ as a vertex. In other words,
$\varphi_l(x,y) = 0$ whenever $(x,y) \notin P_l$. The node $\mathbf{x}_l$ lies on the interior of its vertex polygon
$P_l$, while the vertices of $P_l$ are all the nodes connected to $\mathbf{x}_l$ by a single edge of the
triangulation. In Figure 10.7, the shaded regions indicate two of the vertex polygons for
the triangulation in Figure 10.4.

Example 10.5. The simplest, and most common, triangulations are based on regular
meshes. For example, suppose that the nodes lie on a square grid, and so are of the form
$\mathbf{x}_{i,j} = (i h + a, j h + b)$, where $(i,j)$ run over a collection of integer pairs, $h > 0$ is the
inter-node spacing, and $(a,b)$ represents an overall offset. If we choose the triangles to all
have the same orientation, as in the first picture in Figure 10.8, then the vertex polygons
all have the same shape, consisting of six triangles of total area $3h^2$, namely the shaded region.
On the other hand, if we choose an alternating triangulation, as in the second picture, then
there are two types of vertex polygons. The first, consisting of four triangles, has area $2h^2$,
while the second, containing eight triangles, has twice the area, $4h^2$. In practice, there are
good reasons to prefer the former triangulation.
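The triangle counts and areas quoted in this example can be checked by brute force. In the sketch below, the function names and the grid indexing are our own, and the "alternating" pattern is assumed to be the one in which the diagonal of a cell changes direction according to the parity of $i + j$; the figure itself is not reproduced here, so this is an interpretation rather than a transcription. Every triangle has area $h^2/2$, so the area of a vertex polygon is simply the number of triangles meeting at the node times $h^2/2$.

```python
from collections import defaultdict

def square_mesh_triangles(n, alternating=False):
    """Triangles, as triples of grid nodes (i, j), on an n-by-n array of square cells."""
    triangles = []
    for i in range(n):
        for j in range(n):
            a, b, c, d = (i, j), (i + 1, j), (i + 1, j + 1), (i, j + 1)
            if alternating and (i + j) % 2 == 1:
                triangles += [(a, b, d), (b, c, d)]    # diagonal from b to d
            else:
                triangles += [(a, b, c), (a, c, d)]    # diagonal from a to c
    return triangles

def vertex_polygon_stats(triangles, h=1.0):
    """Map each node to (number of triangles, total area) of its vertex polygon."""
    count = defaultdict(int)
    for tri in triangles:
        for node in tri:
            count[node] += 1
    return {node: (k, k * h * h / 2) for node, k in count.items()}

def interior_stats(n, alternating):
    stats = vertex_polygon_stats(square_mesh_triangles(n, alternating))
    return {s for (i, j), s in stats.items() if 0 < i < n and 0 < j < n}

print(interior_stats(4, alternating=False))   # {(6, 3.0)}
print(interior_stats(4, alternating=True))    # two types: (4, 2.0) and (8, 4.0)
```

With $h = 1$ the uniform orientation gives six triangles and area 3 at every interior node, while the alternating pattern produces the two types with areas 2 and 4, in agreement with the values $3h^2$, $2h^2$, and $4h^2$ stated above.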

In general, to ensure convergence of the finite element solution to the true minimizer,
one should choose triangulations that satisfy the following properties:

• The three side lengths of any individual triangle should be of comparable size, and
so long, skinny triangles and obtuse triangles should be avoided.

• The areas of nearby triangles $T_\nu$ should not vary too much.

• The areas of nearby vertex polygons $P_l$ should also not vary too much.

While the nearby triangles should be of comparable size, one might very well allow wide
variations over the entire domain, with small triangles in regions where the solution is
changing rapidly, and large triangles in less active regions.

Figure 10.8. Square mesh triangulations.


Exercises 


10.3.1. Sketch a triangulation of the following domains so that all triangles have side length
at most .5: (a) a unit square; (b) an isosceles triangle with vertices $(-.5, 0)$, $(.5, 0)$ and
$(0, 1)$; (c) the square $\{\, |x|, |y| < 2 \,\}$ with the hole $\{\, |x|, |y| < 1 \,\}$ removed;
(d) the unit disk; (e) the annulus $1 < \| \mathbf{x} \| < 2$.

10.3.2.  Describe  the  vertex  polygons  for  a  triangulation  that  uses  regular  equilateral  triangles. 

10.3.3.  Are  there  any  restrictions  on  the  number  of  sides  a  vertex  polygon  can  have? 

10.3.4. Find the three finite element functions $\omega_1(x,y)$, $\omega_2(x,y)$, $\omega_3(x,y)$, associated with
(a) the triangle having vertices $(1,0)$, $(0,1)$, and $(1,1)$;
(b) the triangle having vertices $(0,1)$, $(1,-1)$, and $(-1,-1)$;
(c) an equilateral triangle centered at the origin having one vertex at $(1,0)$.

0 10.3.5. (a) Prove that the area of a planar triangle $T$ with vertices $(a,b)$, $(c,d)$, $(e,f)$ is equal to
$\tfrac12\,|\Delta|$, where $\Delta = \det \begin{pmatrix} 1 & a & b \\ 1 & c & d \\ 1 & e & f \end{pmatrix}$. (b) Prove that $\Delta > 0$ if and only if the vertices of the
triangle are listed in counterclockwise order.


0  10.3.6.  Give  a  detailed  justification  of  the  continuity  of  the  pyramid  function  (10.36). 

T 10.3.7. An alternative to triangular elements is to employ piecewise bi-affine functions,
meaning $\omega(x,y) = \alpha + \beta x + \gamma y + \delta x y$, on rectangles. (a) Suppose $R$ is a rectangle with vertices
$(x_1,y_1)$, $(x_2,y_2)$, $(x_3,y_3)$, $(x_4,y_4)$, whose sides are parallel to the coordinate axes. Prove
that, for each $l = 1, \ldots, 4$, there is a unique bi-affine function $\omega_l(x,y)$ defined on $R$ that has
the value $\omega_l(x_l,y_l) = 1$ at one vertex while $\omega_l(x_i,y_i) = 0$, $i \ne l$, at the other three vertices.
(b) Write out the four bi-affine functions $\omega_1(x,y), \ldots, \omega_4(x,y)$, when
(i) $R = \{ 0 < x, y < 1 \}$, (ii) $R = \{ -1 < x, y < 1 \}$. (c) Does the result in part (a) hold for
rectangles whose sides are not aligned with the axes? For general quadrilaterals?


The  Finite  Element  Equations 


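We now seek to approximate the solution to the homogeneous Dirichlet boundary value
problem by restricting the Dirichlet functional (10.26) to the selected finite element
subspace $W$. Using the general framework of Section 10.1, we substitute the formula (10.29)
for a general element of $W$ into the quadratic Dirichlet functional (9.82). Expanding, we
obtain

\[ \begin{aligned}
Q[w] = Q\Bigl[\, \sum_{i=1}^{n} c_i \varphi_i \Bigr]
 &= \iint_\Omega \Bigl[ \tfrac12 \Bigl\| \sum_{i=1}^{n} c_i \nabla\varphi_i \Bigr\|^2 - f(x,y) \sum_{i=1}^{n} c_i \varphi_i \Bigr] dx\,dy \\
 &= \tfrac12 \sum_{i,j=1}^{n} k_{ij}\,c_i\,c_j - \sum_{i=1}^{n} b_i\,c_i = \tfrac12\,\mathbf{c}^T K \mathbf{c} - \mathbf{b}^T \mathbf{c}.
\end{aligned} \tag{10.38} \]

Here $K = (k_{ij})$ is a symmetric $n \times n$ matrix, while $\mathbf{b} = (b_1, b_2, \ldots, b_n)^T$ is a vector in $\mathbb{R}^n$,
with respective entries

\[ k_{ij} = \langle\langle \varphi_i , \varphi_j \rangle\rangle = \iint_\Omega \nabla\varphi_i \cdot \nabla\varphi_j \,dx\,dy, \qquad
b_i = \langle f , \varphi_i \rangle = \iint_\Omega f\,\varphi_i \,dx\,dy, \tag{10.39} \]

which also follow directly from the general formulas (10.6-7). Thus, the finite element
approximation (10.29) will minimize the quadratic function

\[ P(\mathbf{c}) = \tfrac12\,\mathbf{c}^T K \mathbf{c} - \mathbf{b}^T \mathbf{c} \tag{10.40} \]

over all possible choices of coefficients $\mathbf{c} = (c_1, c_2, \ldots, c_n)^T \in \mathbb{R}^n$, i.e., over all possible
function values at the interior nodes. As above, the minimizer’s coefficients are obtained
by solving the associated linear system

\[ K \mathbf{c} = \mathbf{b}, \tag{10.41} \]

using either Gaussian Elimination or a suitable iterative linear systems solver.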
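For completeness, the passage from the quadratic function (10.40) to the linear system (10.41) can be sketched in one line. The computation uses only the symmetry of $K$; that the resulting critical point is a minimum relies on $K$ being positive definite, which we take here from the general framework of Section 10.1 rather than prove.

```latex
\frac{\partial P}{\partial c_i}
  = \frac{\partial}{\partial c_i}\Bigl( \tfrac12 \sum_{j,l=1}^{n} k_{jl}\,c_j\,c_l - \sum_{j=1}^{n} b_j\,c_j \Bigr)
  = \sum_{j=1}^{n} k_{ij}\,c_j - b_i = 0, \qquad i = 1, \ldots, n .
```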

To find explicit formulas for the matrix coefficients $k_{ij}$ in (10.39), we begin by noting
that the gradient of the affine element (10.31) is equal to

\[ \nabla\omega_l^\nu(x,y) = \frac{1}{\Delta_\nu} \begin{pmatrix} y_i - y_j \\ x_j - x_i \end{pmatrix}, \qquad (x,y) \in T_\nu, \tag{10.42} \]

which is a constant vector inside the triangle $T_\nu$, while $\nabla\omega_l^\nu = 0$ outside $T_\nu$. Therefore,

\[ \nabla\varphi_l(x,y) = \begin{cases} \nabla\omega_l^\nu, & \text{if } (x,y) \in T_\nu \text{ that has } \mathbf{x}_l \text{ as a vertex,} \\ 0, & \text{otherwise.} \end{cases} \tag{10.43} \]

Actually, (10.43) is not quite correct, since the gradient is not well defined on the boundary
of a triangle $T_\nu$, but this will not cause us any difficulty in evaluating the ensuing integrals.

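We will approximate integrals over the domain $\Omega$ by summing the corresponding
integrals over the individual triangles, which relies on our assumption that the polygonal
boundary of the triangulation $\partial T_\star$ is a reasonably close approximation to the true boundary
$\partial\Omega$. In particular,

\[ k_{ij} \approx \sum_\nu \iint_{T_\nu} \nabla\varphi_i \cdot \nabla\varphi_j \,dx\,dy \equiv \sum_\nu k_{ij}^\nu. \tag{10.44} \]

Now, according to (10.43), one or the other gradient in the integrand will vanish on the
entire triangle $T_\nu$ unless both $\mathbf{x}_i$ and $\mathbf{x}_j$ are vertices. Therefore, the only terms contributing
to the sum are those triangles $T_\nu$ that have both $\mathbf{x}_i$ and $\mathbf{x}_j$ as vertices. If $i \ne j$, there are
only two such triangles, having a common edge, while if $i = j$, every triangle in the ith
vertex polygon $P_i$ contributes. The individual summands are easily evaluated, since the
gradients are constant on the triangles, and so, by (10.43),

\[ k_{ij}^\nu = \iint_{T_\nu} \nabla\omega_i^\nu \cdot \nabla\omega_j^\nu \,dx\,dy = \nabla\omega_i^\nu \cdot \nabla\omega_j^\nu \;\operatorname{area} T_\nu = \tfrac12\,\nabla\omega_i^\nu \cdot \nabla\omega_j^\nu \,|\Delta_\nu| . \]

Let $T_\nu$ have vertices $\mathbf{x}_i, \mathbf{x}_j, \mathbf{x}_l$. Then, by (10.34, 42, 43),

\[ \begin{aligned}
k_{ij}^\nu &= \frac12\,\frac{(y_j - y_l)(y_l - y_i) + (x_l - x_j)(x_i - x_l)}{(\Delta_\nu)^2}\,|\Delta_\nu|
 = -\,\frac{(\mathbf{x}_i - \mathbf{x}_l) \cdot (\mathbf{x}_j - \mathbf{x}_l)}{2\,|\Delta_\nu|}, \qquad i \ne j, \\
k_{ii}^\nu &= \frac12\,\frac{(y_j - y_l)^2 + (x_l - x_j)^2}{(\Delta_\nu)^2}\,|\Delta_\nu|
 = \frac{\| \mathbf{x}_j - \mathbf{x}_l \|^2}{2\,|\Delta_\nu|} = -\,k_{ij}^\nu - k_{il}^\nu .
\end{aligned} \tag{10.45} \]

In this manner, each triangle $T_\nu$ specifies a collection of six different coefficients, $k_{ij}^\nu = k_{ji}^\nu$,
indexed by its vertices, and known as the elemental stiffnesses of $T_\nu$. Interestingly, the
elemental stiffnesses depend only on the three vertex angles in the triangle and not on
its size. Thus, similar triangles have the same elemental stiffnesses. Indeed, according to
Exercise 10.3.13,

\[ k_{ii}^\nu = \tfrac12 \bigl( \cot\theta_j^\nu + \cot\theta_l^\nu \bigr), \qquad \text{while} \qquad k_{ij}^\nu = k_{ji}^\nu = -\,\tfrac12 \cot\theta_l^\nu, \quad i \ne j, \tag{10.46} \]

where $0 < \theta_l^\nu < \pi$ denotes the angle in $T_\nu$ at the vertex $\mathbf{x}_l$.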
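As a concrete illustration, the elemental stiffnesses of a single triangle can be computed directly from the vertex coordinates. The sketch below uses the dot-product form of (10.45) for the off-diagonal entries and the identity $k_{ii}^\nu = -k_{ij}^\nu - k_{il}^\nu$ for the diagonal ones. The function name `elemental_stiffnesses` and the local vertex numbering 0, 1, 2 are our own conventions.

```python
import math

def elemental_stiffnesses(vertices):
    """3x3 matrix of elemental stiffnesses of a triangle given by three (x, y) pairs."""
    (x0, y0), (x1, y1), (x2, y2) = vertices
    # determinant (10.34): plus or minus twice the area of the triangle
    delta = (x1 * y2 - x2 * y1) - (x0 * y2 - x2 * y0) + (x0 * y1 - x1 * y0)
    if delta == 0:
        raise ValueError("degenerate triangle")
    k = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            if i == j:
                continue
            l = 3 - i - j                      # the remaining vertex
            (xi, yi), (xj, yj), (xl, yl) = vertices[i], vertices[j], vertices[l]
            dot = (xi - xl) * (xj - xl) + (yi - yl) * (yj - yl)
            k[i][j] = -dot / (2.0 * abs(delta))            # off-diagonal part of (10.45)
    for i in range(3):
        k[i][i] = -sum(k[i][j] for j in range(3) if j != i)  # k_ii = -k_ij - k_il
    return k

right = elemental_stiffnesses([(0, 0), (1, 0), (0, 1)])
equilateral = elemental_stiffnesses([(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)])
print(right)
print(equilateral)
```

For the right triangle the diagonal entries are 1, 1/2, 1/2 and the entry coupling the two acute vertices vanishes, consistent with (10.46), since the cotangent of the right angle is zero. For the equilateral triangle every diagonal entry equals $1/\sqrt{3}$ and every off-diagonal entry equals $-1/(2\sqrt{3})$.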

Example 10.6. The right triangle with vertices $\mathbf{x}_1 = (0,0)$, $\mathbf{x}_2 = (1,0)$, $\mathbf{x}_3 = (0,1)$
has elemental stiffnesses

\[ k_{11} = 1, \qquad k_{22} = k_{33} = \tfrac12, \qquad k_{12} = k_{21} = k_{13} = k_{31} = -\,\tfrac12, \qquad k_{23} = k_{32} = 0. \tag{10.47} \]

The same holds for any other isosceles right triangle, provided its vertices are labeled in
the same manner. Similarly, an equilateral triangle has all 60° angles, and so its elemental
stiffnesses are

\[ \begin{aligned}
k_{11} = k_{22} = k_{33} &= \frac{1}{\sqrt3} \approx .5774, \\
k_{12} = k_{21} = k_{13} = k_{31} = k_{23} = k_{32} &= -\,\frac{1}{2\sqrt3} \approx -.2887.
\end{aligned} \tag{10.48} \]

Exercises


10.3.8. Write down the elemental stiffnesses for: (a) the triangle with vertices $(0,1)$, $(-1,2)$,
$(0,-1)$; (b) the triangle with vertices $(1,1)$, $(-1,1)$, $(0,-2)$; (c) a 30-60-90 degree right
triangle; (d) a right triangle with side lengths 3, 4, 5; (e) an isosceles triangle of height 3
and base 2; (f) a “golden” isosceles triangle with angles 36°, 72°, 72°.

0 10.3.9. A rectangular mesh has nodes $\mathbf{x}_{i,j} = (i\,\Delta x + a, j\,\Delta y + b)$, where $\Delta x, \Delta y > 0$ are,
respectively, the horizontal and vertical step sizes. Find the elemental stiffnesses for the
triangles associated with such a rectangular mesh.

10.3.10. True or false: Let $T$ be a triangle, and $\widetilde{T}$ a triangle obtained by rotating $T$ by 60°.
Then $T$ and $\widetilde{T}$ have the same elemental stiffnesses.

10.3.11. Prove that the gradient (10.42) of the affine element is equal
to $\nabla\omega_l^\nu = \mathbf{a}_l^\nu / \| \mathbf{a}_l^\nu \|^2$, where $\mathbf{a}_l^\nu$ is the altitude vector that goes
to the vertex $\mathbf{x}_l$ from its opposite side, as indicated in the figure.

10.3.12. Explain why the pyramid functions are linearly independent.

0 10.3.13. Prove formulas (10.46).




Assembling  the  Elements 

The elemental stiffnesses of each triangle will contribute, through the summation (10.44),
to the finite element coefficient matrix $K$. We begin by constructing a larger matrix $\widehat{K}$,
which we call the full finite element matrix, of size $m \times m$, where $m$ is the total number
of nodes in our triangulation, including both interior and boundary nodes. The rows and
columns of $\widehat{K}$ are labeled by the nodes $\mathbf{x}_1, \ldots, \mathbf{x}_m$. Let $K_\nu = (k_{ij}^\nu)$ be the corresponding
$m \times m$ matrix containing the elemental stiffnesses $k_{ij}^\nu$ of $T_\nu$ in the rows and columns
indexed by its vertices, and all other entries equal to 0. Thus, $K_\nu$ will have (at most) nine
nonzero entries. The resulting $m \times m$ matrices are summed together over all the triangles
$T_1, \ldots, T_N$, whereby

\[ \widehat{K} = \sum_{\nu=1}^{N} K_\nu, \tag{10.49} \]

in accordance with (10.44).

The full finite element matrix $\widehat{K}$ is too large, since its rows and columns include all
the nodes, whereas the finite element matrix $K$ appearing in (10.41) refers only to the $n$
interior nodes. The reduced $n \times n$ finite element matrix $K$ is simply obtained from $\widehat{K}$ by
deleting all rows and columns indexed by boundary nodes, retaining only the elements $k_{ij}$
for which both $\mathbf{x}_i$ and $\mathbf{x}_j$ are interior nodes. For the homogeneous boundary value problem,
this is all we require. As we will subsequently see, inhomogeneous boundary conditions are
most easily handled by retaining (another part of) the full matrix $\widehat{K}$.
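The assembly process can be sketched as follows, under a few assumptions of our own: nodes are stored as a list of coordinate pairs, triangles as triples of node indices, and the matrices as plain nested lists (a practical code would use a sparse format). The helper `elemental_stiffnesses` computes the stiffnesses from the constant gradients (10.42) and the triangle area. The small test mesh, a 3 by 3 square grid with unit spacing and a single interior node, is purely illustrative.

```python
def elemental_stiffnesses(p):
    """Elemental stiffness matrix of the triangle with vertex list p, via the gradients (10.42)."""
    (x0, y0), (x1, y1), (x2, y2) = p
    delta = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)   # twice the signed area
    grads = [((p[(a + 1) % 3][1] - p[(a + 2) % 3][1]) / delta,
              (p[(a + 2) % 3][0] - p[(a + 1) % 3][0]) / delta) for a in range(3)]
    area = abs(delta) / 2.0
    return [[(ga[0] * gb[0] + ga[1] * gb[1]) * area for gb in grads] for ga in grads]

def assemble(nodes, triangles, interior):
    """Return the full m-by-m matrix and the reduced matrix for the interior nodes."""
    m = len(nodes)
    K_full = [[0.0] * m for _ in range(m)]
    for tri in triangles:
        k = elemental_stiffnesses([nodes[v] for v in tri])
        for a in range(3):                     # add the (at most) nine entries
            for b in range(3):
                K_full[tri[a]][tri[b]] += k[a][b]
    # delete rows and columns belonging to boundary nodes
    K = [[K_full[i][j] for j in interior] for i in interior]
    return K_full, K

# 3-by-3 grid of nodes with unit spacing; node index = i + 3*j, interior node is 4
nodes = [(float(i), float(j)) for j in range(3) for i in range(3)]
triangles = []
for j in range(2):
    for i in range(2):
        a, b, c, d = i + 3 * j, i + 1 + 3 * j, i + 1 + 3 * (j + 1), i + 3 * (j + 1)
        triangles += [(a, b, c), (a, c, d)]    # all triangles share one orientation
K_full, K = assemble(nodes, triangles, interior=[4])
print(K)            # [[4.0]]
print(K_full[4])    # row of the full matrix belonging to the interior node
```

In this test the interior node is coupled to its four horizontal and vertical neighbors with weight $-1$ and not at all to its diagonal neighbors, and each row of the full matrix sums to zero, as it must because constant functions have zero gradient.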

The  easiest  way  to  absorb  the  construction  is  by  working  through  a  particular  example. 

Example 10.7. A metal plate has the shape of an oval running track, consisting
of a rectangle, with side lengths 1 m by 2 m, and two semicircular disks glued onto its
shorter ends, as sketched in Figure 10.9. The plate is subject to a heat source, while its
edges are held at a fixed temperature. The problem is to find the equilibrium temperature
distribution within the plate. Mathematically, we must solve the planar Poisson equation,
subject to Dirichlet boundary conditions, for the equilibrium temperature $u(x,y)$.

Figure 10.9. The oval plate.

Figure 10.10. A coarse triangulation of the oval plate.

Let  us  describe  how  to  set  up  the  finite  element  approximation.  We  begin  with  a 
very  coarse  triangulation  of  the  plate,  which  will  not  give  particularly  accurate  results, 
but  serves  to  illustrate  how  to  go  about  assembling  the  finite  element  matrix.  We  divide 
the  rectangular  part  of  the  plate  into  eight  right  triangles,  while  each  semicircular  end 
will  be  approximated  by  three  equilateral  triangles.  The  triangles  are  numbered  from  1 
to  14  as  indicated  in  Figure  10.10.  There  are  13  nodes  in  all,  numbered  as  in  the  second 
figure.  Only  nodes  1,2,3  are  interior,  while  the  boundary  nodes  are  labeled  4  through  13 
in counterclockwise order starting at the top. The full finite element matrix $K^*$ will have
size $13 \times 13$, its rows and columns labeled by all the nodes, while the reduced matrix $K$
appearing in the finite element equations (10.41) consists of the upper left $3 \times 3$ submatrix
of $K^*$ corresponding to the three interior nodes.

For each $\nu = 1, \ldots, 14$, the triangle $T_\nu$ will contribute its elemental stiffnesses, as
indexed by its vertices, to the matrix $K^*$ through a summand $K_\nu$. For example, the first
triangle $T_1$ is equilateral, and so has elemental stiffnesses (10.48). Its vertices are labeled
1, 5, and 6, and therefore we place the stiffnesses in the rows and columns numbered 1, 5, 6
to form the summand $K_1$, whose nonzero entries are

$$ (K_1)_{11} = (K_1)_{55} = (K_1)_{66} = .5774, \qquad (K_1)_{15} = (K_1)_{16} = (K_1)_{56} = -.2887, $$

along with their symmetric counterparts, where all the remaining entries in the full $13 \times 13$
matrix are 0. The next triangle $T_2$ has the same equilateral elemental stiffness matrix (10.48),
but now its vertices are 1, 6, 7, and so it will contribute the same values in the rows and
columns numbered 1, 6, 7. Similarly for $K_3$, with vertices 1, 7, 8. On the other hand, $T_4$ is an
isosceles right triangle, and so has elemental stiffnesses (10.47). Its vertices are labeled
1, 4, and 5, with vertex 5 at the right angle. Therefore, its contribution $K_4$ has nonzero entries

$$ (K_4)_{55} = 1, \qquad (K_4)_{11} = (K_4)_{44} = .5, \qquad (K_4)_{15} = (K_4)_{45} = -.5, $$

again along with their symmetric counterparts; the entry $(K_4)_{14}$ linking the two ends of the
hypotenuse is 0. Continuing in this manner, we assemble 14 contributions $K_1, \ldots, K_{14}$, each
with at most 9 nonzero entries. The full finite element matrix is their sum

$$ K^* = K_1 + K_2 + \cdots + K_{14}. \tag{10.50} $$

Since nodes 1, 2, 3 are interior, the reduced finite element matrix

$$ K = \begin{pmatrix} 3.732 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 3.732 \end{pmatrix} \tag{10.51} $$

uses only the upper left $3 \times 3$ block of $K^*$. Clearly, it would not be difficult to directly
construct $K$, bypassing $K^*$ entirely.
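The assembly of Example 10.7 is mechanical enough to sketch in a few lines of Python. The sketch below is only an illustration under stated assumptions: the node coordinates, and the triangles other than the first four, are inferred from the description of the coarse triangulation (right triangles with unit legs, equilateral triangles with unit sides), and the function names are hypothetical. Each elemental stiffness is computed as the triangle's area times the dot product of the gradients of two affine elements, and is then added into the rows and columns indexed by the triangle's vertices.

```python
import math

def elemental_stiffness(p, q, r):
    """3x3 stiffness matrix of the affine elements on the triangle p, q, r."""
    det = (q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])
    area = abs(det) / 2.0
    verts = [p, q, r]
    grads = []
    for a in range(3):
        # The element equal to 1 at verts[a] vanishes along the opposite side b-c.
        b, c = verts[(a + 1) % 3], verts[(a + 2) % 3]
        grads.append(((b[1] - c[1]) / det, (c[0] - b[0]) / det))
    return [[area * (g[0] * h[0] + g[1] * h[1]) for h in grads] for g in grads]

def assemble_full_matrix(nodes, triangles):
    """Sum the elemental stiffnesses into the m x m full matrix (labels start at 1)."""
    m = len(nodes)
    full = [[0.0] * m for _ in range(m)]
    for tri in triangles:
        k = elemental_stiffness(*(nodes[v] for v in tri))
        for a, i in enumerate(tri):
            for b, j in enumerate(tri):
                full[i - 1][j - 1] += k[a][b]
    return full

s = math.sqrt(3.0) / 2.0
# ASSUMED coordinates: interior nodes 1, 2, 3 on the center line, boundary
# nodes 4..13 counterclockwise starting at the top.
nodes = {
    1: (0.0, 0.0), 2: (1.0, 0.0), 3: (2.0, 0.0),
    4: (1.0, 1.0), 5: (0.0, 1.0), 6: (-s, 0.5), 7: (-s, -0.5), 8: (0.0, -1.0),
    9: (1.0, -1.0), 10: (2.0, -1.0), 11: (2.0 + s, -0.5), 12: (2.0 + s, 0.5),
    13: (2.0, 1.0),
}
triangles = [
    (1, 5, 6), (1, 6, 7), (1, 7, 8), (1, 4, 5),    # vertices as stated in the text
    (1, 2, 4), (2, 3, 4), (3, 13, 4),              # ASSUMED remaining right triangles
    (1, 8, 9), (1, 9, 2), (2, 9, 3), (3, 9, 10),
    (3, 10, 11), (3, 11, 12), (3, 12, 13),         # ASSUMED right-hand equilateral ones
]

K_full = assemble_full_matrix(nodes, triangles)
K_reduced = [row[:3] for row in K_full[:3]]        # rows and columns of interior nodes
for row in K_reduced:
    print(["%.3f" % entry for entry in row])
```

Under these assumptions the printed block agrees with (10.51) to the displayed accuracy. Every row of the full matrix sums to zero, because the affine elements on each triangle add up to the constant function 1.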

For  a  finer  triangulation,  the  construction  is  similar,  but  the  matrices  become  much 
larger.  The  procedure  can,  of  course,  be  automated.  Fortunately,  if  we  choose  a  very 
regular  triangulation,  then  we  do  not  need  to  be  nearly  as  meticulous  in  assembling  the 
stiffness  matrices,  since  many  of  the  entries  are  the  same.  The  simplest  case  employs  a 
uniform  square  mesh,  and  so  triangulates  the  domain  into  isosceles  right  triangles.  This  is 
accomplished by laying out a relatively dense square grid over the domain $\Omega \subset \mathbb{R}^2$. The
interior nodes are the grid points that fall inside the oval domain, while the boundary
nodes are all those grid points lying adjacent to one or more of the interior nodes, and
are near but not necessarily precisely on the boundary $\partial\Omega$. Figure 10.11 shows the nodes
in a square grid with intermesh spacing $h = .2$. While a bit crude in its approximation
of  the  boundary  of  the  domain,  this  procedure  does  have  the  advantage  of  making  the 
construction  of  the  associated  finite  element  matrix  relatively  painless. 

For such a mesh, all the triangles are isosceles right triangles, with elemental stiffnesses
(10.47). Summing the corresponding matrices $K_\nu$ over all the triangles, as in (10.49), we
find that the rows and columns of $K^*$ corresponding to the interior nodes all have the same
form. Namely, if $i$ labels an interior node, then the corresponding diagonal entry is $k_{ii} = 4$,
while the off-diagonal entries $k_{ij} = k_{ji}$, $i \ne j$, are equal to $-1$ when node $i$ is adjacent to
node $j$ on the grid, and are equal to 0 in all other cases. Node $j$ is allowed to be a boundary
node. (Interestingly, the result does not depend on how one orients the pair of triangles
making up each square of the grid, which plays a role only in the computation of the
right-hand side of the finite element equation.) Observe that the same computation applies even
to our coarse triangulation. The interior node 2 belongs only to isosceles right triangles, and
the corresponding nonzero entries in (10.50) are $k_{22} = 4$ and $k_{21} = k_{23} = k_{24} = k_{29} = -1$,
indicating the four adjacent nodes.

Figure 10.11. A square mesh for the oval plate.
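For a square mesh, the reduced matrix can thus be written down directly from the grid adjacency, without visiting individual triangles. A minimal sketch, assuming that the interior nodes are supplied as a list of integer grid positions (a hypothetical representation chosen for this illustration):

```python
def square_mesh_matrix(interior):
    """Reduced finite element matrix for a square mesh.

    `interior` lists the integer grid positions (i, j) of the interior nodes.
    """
    index = {node: n for n, node in enumerate(interior)}
    size = len(interior)
    K = [[0.0] * size for _ in range(size)]
    for (i, j), row in index.items():
        K[row][row] = 4.0
        for neighbour in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
            col = index.get(neighbour)
            if col is not None:      # boundary neighbours are not columns of K
                K[row][col] = -1.0
    return K

# A 2 x 2 block of interior nodes surrounded by boundary nodes.
interior = [(1, 1), (2, 1), (1, 2), (2, 2)]
K = square_mesh_matrix(interior)
for row in K:
    print(row)
```

The entries equal to $-1$ that link an interior node to a boundary node do not appear here; they belong to the remaining columns of the corresponding interior rows of the full matrix.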

Remark :  The  coefficient  matrix  constructed  from  the  finite  element  method  on  a 
square  (or  even  rectangular)  grid  is  the  same  as  the  coefficient  matrix  arising  from  a 
finite  difference  solution  to  the  Laplace  or  Poisson  equation,  as  described  in  Example  5.7. 
The  finite  element  approach  has  the  advantage  of  readily  adapting  to  much  more  general 
discretizations  of  the  domain,  and  is  not  restricted  to  rectangular  grids. 


The  Coefficient  Vector  and  the  Boundary  Conditions 


So far, we have been concentrating on assembling the finite element coefficient matrix $K$.
We also need to compute the forcing vector $b = (b_1, b_2, \ldots, b_n)^T$ appearing on the
right-hand side of the fundamental linear equation (10.41). According to (10.39), the entries $b_i$
are found by integrating the product of the forcing function and the finite element basis
function. As before, we will approximate the integral over the domain $\Omega$ by an integral
over the triangles, and so

$$ b_i = \iint_\Omega f(x,y)\, \varphi_i(x,y)\, dx\, dy \approx \sum_\nu \iint_{T_\nu} f(x,y)\, \omega^\nu_i(x,y)\, dx\, dy = \sum_\nu b^\nu_i. \tag{10.52} $$

Typically, an exact computation of the various triangular double integrals is not so
convenient, and so we resort to a numerical approximation. Since we are assuming that
the individual triangles are small, we can get away with a very crude numerical integration
scheme. If the function $f(x,y)$ does not vary much over the triangle $T_\nu$ (which will
certainly be the case if $T_\nu$ is sufficiently small), we may approximate $f(x,y) \approx c^\nu_i$ for
$(x,y) \in T_\nu$ by a constant. The integral (10.52) is then approximated by

$$ \iint_{T_\nu} f(x,y)\, \omega^\nu_i(x,y)\, dx\, dy \approx c^\nu_i \iint_{T_\nu} \omega^\nu_i(x,y)\, dx\, dy = \tfrac13\, c^\nu_i \operatorname{area} T_\nu = \tfrac16\, c^\nu_i\, |\Delta_\nu|. \tag{10.53} $$

The formula for the integral of the affine element $\omega^\nu_i(x,y)$ follows from solid geometry: it
equals the volume under its graph, a tetrahedron of height 1 and base $T_\nu$, as illustrated in
Figure 10.12.

Figure 10.12. Finite element tetrahedron.

How to choose the constant $c^\nu_i$? In practice, the simplest choice is to let $c^\nu_i = f(x_i, y_i)$
be the value of the function at the $i$th vertex. With this choice, the sum in (10.52) becomes

$$ b_i \approx \sum_\nu \tfrac13\, f(x_i, y_i) \operatorname{area} T_\nu = \tfrac13\, f(x_i, y_i) \operatorname{area} P_i, \tag{10.54} $$

where $P_i$ is the vertex polygon (10.37) corresponding to the node $\mathbf{x}_i$. In particular, for the
square mesh with the uniform choice of triangles, as in the first plot in Figure 10.8,

$$ \operatorname{area} P_i = 3 h^2 \quad \text{for all } i, \qquad \text{and so} \qquad b_i \approx f(x_i, y_i)\, h^2 \tag{10.55} $$

is well approximated by just $h^2$ times the value of the forcing function at the node. This
is the underlying reason to choose the uniform triangulation for the square mesh; the
alternating version would give unequal values for the $b_i$ over adjacent nodes, and this could
give rise to unnecessary errors in the final approximation.
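The count behind the square mesh formula is short enough to write out. Assuming, as is the case for the uniform triangulation, that every interior node is a vertex of exactly six isosceles right triangles whose legs have length $h$, each of area $\frac12 h^2$:

```latex
\operatorname{area} P_i = 6 \cdot \tfrac12\, h^2 = 3 h^2 ,
\qquad\text{whence}\qquad
b_i \approx \tfrac13\, f(x_i, y_i) \operatorname{area} P_i = f(x_i, y_i)\, h^2 .
```

In the alternating triangulation, by contrast, a node is a vertex of either four or eight triangles, so the vertex polygons have areas $2h^2$ and $4h^2$ at neighboring nodes.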

Example 10.8. For the coarsely triangulated oval plate, the reduced stiffness matrix
is (10.51). The Poisson equation

$$ -\Delta u = 4 $$

models a constant external heat source of magnitude 4° over the entire plate. If we keep the
edges of the plate fixed at 0°, then we need to solve the finite element equation $K c = b$,
where $K$ is the coefficient matrix (10.51). The entries of $b$ are, by (10.54), equal to 4 (the
right-hand side of the differential equation) times one-third the area of the corresponding
vertex polygon, which for node 2 is the square consisting of four right triangles, each of
area $\frac12$, whereas for nodes 1 and 3 it consists of four right triangles of area $\frac12$ plus three
equilateral triangles, each of area $\frac{\sqrt3}{4}$; see Figure 10.10. Thus,

$$ b = \tfrac43 \left( 2 + \tfrac{3\sqrt3}{4},\ 2,\ 2 + \tfrac{3\sqrt3}{4} \right)^T = ( 4.3987,\ 2.6667,\ 4.3987 )^T. $$

The solution to the final linear system $K c = b$ is easily found:

$$ c = ( 1.5672,\ 1.4503,\ 1.5672 )^T. $$

Its entries are the values of the finite element approximation at the three interior nodes.
The piecewise affine finite element solution is plotted in the first illustration in Figure 10.13.
A more accurate approximation, based on a square grid triangulation of size $h = .1$, appears
in the second figure. Here, the largest errors are concentrated near the poorly approximated
corners of the oval, and could be improved by a more sophisticated triangulation.

Figure 10.13. Finite element solutions to Poisson's equation for an oval plate.
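The numbers in Example 10.8 are easy to check by machine. The following sketch is an illustration only: the helper `solve` is a hypothetical stand-in for any linear solver, and the corner entries of the matrix are taken to be $2 + \sqrt3$, which is an assumption consistent with the rounded value 3.732.

```python
import math

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(A[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

corner = 2.0 + math.sqrt(3.0)            # ASSUMED exact value behind 3.732
K = [[corner, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, corner]]

area_right = 0.5                         # right triangle with unit legs
area_equilateral = math.sqrt(3.0) / 4.0  # equilateral triangle with unit sides
polygon_areas = [4 * area_right + 3 * area_equilateral,   # node 1
                 4 * area_right,                          # node 2
                 4 * area_right + 3 * area_equilateral]   # node 3

forcing = 4.0                            # right-hand side of the Poisson equation
b = [forcing * area / 3.0 for area in polygon_areas]      # one-third rule (10.54)
c = solve(K, b)
print([round(value, 4) for value in b])
print([round(value, 4) for value in c])
```

By the symmetry of the plate and of the forcing, the first and third entries of the solution coincide, which provides a quick consistency check on any such computation.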


Inhomogeneous  Boundary  Conditions 

So far, we have restricted our attention to problems with homogeneous Dirichlet boundary
conditions. According to Theorem 9.32, the solution to the inhomogeneous Dirichlet
problem

$$ -\Delta u = f \quad \text{in } \Omega, \qquad u = h \quad \text{on } \partial\Omega, $$

is also obtained by minimizing the Dirichlet functional (9.82). However, now the
minimization takes place over the set of functions that satisfy the inhomogeneous boundary
conditions. It is not difficult to fit this problem into the finite element scheme.

The elements corresponding to the interior nodes of our triangulation remain as before,
but now we need to include additional elements to ensure that our approximation satisfies
the boundary conditions. Note that if $\mathbf{x}_l$ is a boundary node, then the corresponding
boundary element $\varphi_l(x,y)$ satisfies (10.28), and so has the same piecewise affine form
(10.36). The corresponding finite element approximation

$$ w(x,y) = \sum_{i=1}^{m} c_i\, \varphi_i(x,y) \tag{10.56} $$

has the same form as before, (10.29), but now the sum is over all nodes, both interior
and boundary. As before, the coefficients $c_i = w(x_i, y_i) \approx u(x_i, y_i)$ are the values of the
finite element approximation at the nodes. Therefore, in order to satisfy the boundary
conditions, we require

$$ c_j = h_j = h(x_j, y_j) \quad \text{whenever} \quad \mathbf{x}_j = (x_j, y_j) \ \text{is a boundary node.} \tag{10.57} $$

If the boundary node $\mathbf{x}_j$ does not lie precisely on the boundary $\partial\Omega$, then $h(x_j, y_j)$ is not
defined, and so we need to approximate the value $h_j$ appropriately, e.g., using the value of
$h(x,y)$ at a nearby boundary point $(x,y) \in \partial\Omega$.

The derivation of the finite element equations proceeds as before, but now there are
additional terms arising from the nonzero boundary values. Leaving the intervening details
to Exercise 10.3.23, the final outcome can be written as follows. Let $K^*$ denote the full $m \times m$
finite element matrix constructed as above. The reduced coefficient matrix $K$ is obtained
by retaining the rows and columns corresponding to only interior nodes, and so will have
size $n \times n$, where $n$ is the number of interior nodes. The boundary coefficient matrix $\widetilde{K}$ is
the $n \times (m - n)$ matrix consisting of those entries of the interior rows that do not appear
in $K$, i.e., those lying in the columns indexed by the boundary nodes. For instance, in the
coarse triangulation of the oval plate, the full finite element matrix is given in (10.50), and
the upper $3 \times 3$ subblock is the reduced matrix (10.51). The remaining entries of the first
three rows form the boundary coefficient matrix

$$ \widetilde{K} = \begin{pmatrix} 0 & -.7887 & -.5774 & -.5774 & -.7887 & 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & -.7887 & -.5774 & -.5774 & -.7887 \end{pmatrix}. \tag{10.58} $$

We similarly split the coefficients $c_i$ of the finite element function (10.56) into two groups.
We let $c = (c_1, c_2, \ldots, c_n)^T \in \mathbb{R}^n$ denote the as yet unknown coefficients corresponding to
the values of the approximation at the interior nodes $\mathbf{x}_i$, while $h = (h_1, h_2, \ldots, h_{m-n})^T \in \mathbb{R}^{m-n}$
will be the vector containing the boundary values (10.57). The solution to the finite
element approximation (10.56) is then obtained by solving the associated linear system

$$ K c + \widetilde{K} h = b, \qquad \text{or, equivalently,} \qquad K c = f = b - \widetilde{K} h. \tag{10.59} $$
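A quick sanity check of the split system is to feed it constant boundary data with no forcing: the constant function then solves the boundary value problem exactly, so the interior values should reproduce the same constant, up to the rounding of the four-digit matrix entries. The sketch below uses the matrices of the coarse oval triangulation; the helper names and the test value 5 are hypothetical choices made for this illustration.

```python
def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [list(A[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))) / M[i][i]
    return x

def interior_values(K, K_boundary, b, h):
    """Form f = b - K_boundary h and solve K c = f for the interior values c."""
    f = [b_i - sum(k * h_j for k, h_j in zip(row, h))
         for b_i, row in zip(b, K_boundary)]
    return solve(K, f)

# Reduced and boundary coefficient matrices of the coarse oval triangulation.
K = [[3.732, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 3.732]]
K_boundary = [
    [0, -.7887, -.5774, -.5774, -.7887, 0, 0, 0, 0, 0],
    [-1, 0, 0, 0, 0, -1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, -.7887, -.5774, -.5774, -.7887],
]

h = [5.0] * 10       # hypothetical test data: every boundary node held at 5
b = [0.0] * 3        # no internal heat source
c = interior_values(K, K_boundary, b, h)
print(c)             # each entry should be close to 5
```

The check works because each interior row of the full matrix sums to zero, so that moving the boundary columns to the right-hand side exactly balances a constant interior vector.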

Example 10.9. For the oval plate discussed in Example 10.7, suppose the right-hand
semicircular edge is held at 10°, the left-hand semicircular edge at −10°, while the two
straight edges have a linearly varying temperature distribution ranging from −10° at the
left to 10° at the right, as illustrated in Figure 10.14. Our task is to compute its equilibrium
temperature, assuming no internal heat source. Thus, for the coarse triangulation we have
the boundary node values

$$ h = ( h_4, \ldots, h_{13} )^T = ( 0, -10, -10, -10, -10, 0, 10, 10, 10, 10 )^T. $$

Using the previously computed formulas (10.51, 58) for the interior and boundary coefficient
matrices $K$, $\widetilde{K}$, we approximate the solution to the Laplace equation by solving (10.59). We
are assuming that there is no external forcing function, $f(x,y) \equiv 0$, and hence $b = 0$, and
so we must solve $K c = f = -\widetilde{K} h = ( -27.32,\ 0,\ 27.32 )^T$. Since the boundary data are
antisymmetric about the vertical center line of the plate, so is the solution
$c = ( -7.32,\ 0,\ 7.32 )^T$. The corresponding finite element function is plotted in the first
illustration in Figure 10.14. Even on such a coarse mesh, the approximation is not too bad,
as evidenced by the second illustration, which plots the finite element solution for the finer
square mesh of Figure 10.11.

Figure 10.14. Solution to the Dirichlet problem for the oval plate.


Exercises 

10.3.14. Consider the Dirichlet boundary value problem $\Delta u = 0$, $u(x,0) = \sin x$, $u(x,\pi) = 0$,
$u(0,y) = 0$, $u(\pi,y) = 0$, on the square $S = \{ 0 < x, y < \pi \}$. (a) Find the exact solution.
(b) Set up and solve the finite element equations based on a square mesh with $n = 2$
squares on each side of $S$. Write out the reduced finite element matrix, the boundary
coefficient matrix, and the value of your approximation at the middle of the square. How
close is this value to the exact solution there? (c) Repeat part (b) for $n = 4$ squares per
side. Is the value of your approximation at the center of the square closer to the true
solution? (d) Use a computer to find a finite element approximation to $u(\frac12\pi, \frac12\pi)$ using
$n = 8$ squares per side. Is your approximation converging to the exact solution as the mesh
becomes finer and finer?

£ 10.3.15. Approximate the solution to the Dirichlet problem $\Delta u = 0$, $u(x,0) = x$, $u(x,1) = 1 - x$,
$u(0,y) = y$, $u(1,y) = 1 - y$, by use of finite elements with mesh sizes $\Delta x = \Delta y = .25$
and $.1$. Compare your approximations with the solution you obtained in Exercise 4.3.12(d).
What is the maximal error at the nodes in each case?

10.3.16. A metal plate has the shape of an equilateral triangle with unit sides. One side is
heated to 100°, while the other two are kept at 0°. In order to approximate the equilibrium
temperature distribution, the plate is divided into smaller equilateral triangles, with n
triangles on each side, and the corresponding finite element approximation is then
computed. (a) How many triangles are in the triangulation? How many interior nodes?
How many edge nodes? (b) For n = 2, set up and solve the finite element linear system
to find an approximation to the temperature at the center of the triangle. (c) Answer part
(b) when n = 3. (d) Use a computer to find the finite element approximation to the
temperature at the center when n = 5, 10, and 15. Are your values converging to the actual
temperature? (e) Plot the finite element approximations you constructed in the previous
parts.

10.3.17.  Find  the  equilibrium  temperature  distribution  in  a  unit  equilateral  triangle  when  one 
side  is  heated  to  100°,  while  the  other  two  are  insulated. 

4b  10.3.18.  A  metal  plate  has  the  shape  of  a  3  cm  square  with  a  1  cm  square  hole  cut  out  of  the 
middle.  The  plate  is  heated  by  fixing  the  inner  edge  at  temperature  100°  while  keeping 
the outer edge at 0°. (a) Find the (approximate) equilibrium temperature using finite
elements with a mesh width of $\Delta x = \Delta y = .5$ cm. Plot your approximate solution using
a three-dimensional graphics program. (b) Let C denote the square contour lying midway
between the inner and outer square boundaries of the plate. Using your finite element
approximation, at what point(s) on C is the temperature a (i) minimum? (ii) maximum?
(iii) equal to 50°, the average of the two boundary temperatures? (c) Repeat part (a)
using a smaller mesh width of h = .2. How much does this affect your answers in part (b)?


X  10.3.19.  Answer  Exercise  10.3.18  when  the  plate  is  additionally  subjected  to  a  constant  heat 
source $f(x,y) = 600\,x + 800\,y - 2400$.


10.3.20. (a) Construct a finite element approximation to the solution, using a maximal mesh
size of .1, to the following boundary value problem on the unit disk:

$$ \Delta u = 0, \quad x^2 + y^2 < 1, \qquad u = \begin{cases} 1, & x^2 + y^2 = 1, \ y > 0, \\ 0, & x^2 + y^2 = 1, \ y < 0. \end{cases} $$

(b) Compare your solution with the exact solution given in Example 4.7.

£  10.3.21.  (a)  Use  finite  elements  to  approximate  the  solution  to  the  boundary  value  problem 
$-\Delta u + u = 0$, $0 < x, y < 1$, $u(x,0) = u(x,1) = u(0,y) = 0$, $u(1,y) = 1$.

(b)  Compare  your  result  with  the  first  5  and  10  summands  in  the  series  solution  obtained 
via  separation  of  variables. 


^  10.3.22.  (a)  Justify  the  construction  of  the  finite  element  matrix  for  a  square  mesh  described 
in the text. (b) How would you modify the matrix for a rectangular mesh, as in Exercise
10.3.9?


^  10.3.23.  Justify  the  inhomogeneous  finite  element  construction  in  the  text. 


C 10.3.24. (a) Explain how to adapt the finite element method to a mixed boundary value
problem with inhomogeneous Neumann conditions. (b) Apply your method to the problem

$$ \Delta u = 0, \qquad \frac{\partial u}{\partial y}(x,0) = x, \quad u(x,1) = 0, \quad u(0,y) = 0, \quad u(1,y) = 0. $$

(c)  Solve  the  boundary  value  problem  via  separation  of  variables.  Compare  the  values  of 
your  solutions  at  the  center  of  the  square. 


10.4  Weak  Solutions 

An alternative route to the finite element method, which avoids the requirement of a
minimization principle, rests upon the notion of a weak solution to a differential equation,
a concept of considerable independent interest, since it includes many of the nonclassical
solutions  that  we  encountered  earlier  in  this  book.  In  particular,  the  discontinuous  shock 
waves  of  Section  2.3  are,  in  fact,  weak  solutions  to  the  nonlinear  transport  equation,  as  are 
the  continuous  but  only  piecewise  smooth  solutions  to  the  wave  equation  that  resulted  from 
applying  d’Alembert’s  formula  to  nonsmooth  initial  data.  Weak  solutions  have  become  an 
incredibly  powerful  idea  in  the  modern  theory  of  partial  differential  equations,  and  we  have 
space  to  present  only  the  very  basics  here.  They  are  particularly  appropriate  in  the  study 
of  discontinuous  and  nonsmooth  physical  phenomena,  including  shock  waves,  cracks  and 
dislocations  in  elastic  media,  singularities  in  liquid  crystals,  and  so  on.  In  the  mathematical 
analysis  of  partial  differential  equations,  it  is  often  easier  to  prove  the  existence  of  a  weak 
solution,  for  which  one  can  then  try  to  establish  sufficient  smoothness  in  order  that  it 
qualify  as  a  classical  solution.  Further  developments  along  with  a  range  of  applications  can 
be found in more advanced texts, including [38, 44, 61, 99, 107, 122].

Weak Formulations of Linear Systems


The  key  idea  behind  the  concept  of  a  weak  solution  begins  with  a  rather  trivial  observation: 
the  only  element  in  an  inner  product  space  that  is  orthogonal  to  every  other  element  is  the 
zero  element. 


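Lemma 10.10. Let $V$ be an inner product space with inner product† $\langle\!\langle \cdot\,, \cdot \rangle\!\rangle$. An
element $v^* \in V$ satisfies $\langle\!\langle v^*, v \rangle\!\rangle = 0$ for all $v \in V$ if and only if $v^* = 0$.

Proof: In particular, $v^*$ must be orthogonal to itself, so $0 = \langle\!\langle v^*, v^* \rangle\!\rangle = \| v^* \|^2$,
which immediately implies $v^* = 0$. Q.E.D.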


Thus, one method of solving a linear or even nonlinear equation $F[u] = 0$ is to
write it in the form

$$ \langle\!\langle F[u], v \rangle\!\rangle = 0 \qquad \text{for all} \quad v \in V, \tag{10.60} $$

where $V$ is the target space of $F: U \to V$. In particular, for an inhomogeneous linear
system, $L[u] = f$, with $L: U \to V$ a linear operator between inner product spaces, the
condition (10.60) takes the form

$$ 0 = \langle\!\langle L[u] - f, v \rangle\!\rangle = \langle\!\langle L[u], v \rangle\!\rangle - \langle\!\langle f, v \rangle\!\rangle \qquad \text{for all} \quad v \in V, $$

or, equivalently,

$$ \langle u, L^*[v] \rangle - \langle\!\langle f, v \rangle\!\rangle = 0 \qquad \text{for all} \quad v \in V, \tag{10.61} $$

where $L^*: V \to U$ denotes the adjoint of the operator $L$, as defined in (9.2). We will call
(10.61) the weak formulation of the original linear system.

So  far  we  have  not  really  done  anything  of  substance,  and,  indeed,  for  linear  systems 
of  algebraic  equations,  this  more  complicated  characterization  of  solutions  is  of  scant  help. 
However, this is no longer the case for differential equations, because, thanks to the integration
by parts argument used to determine the adjoint operator, the solution $u$ to the weak
form  (10.61)  is  not  restricted  by  the  degree  of  smoothness  required  of  a  classical  solution. 
A  simple  example  will  illustrate  the  basic  construction. 


Example 10.11. On a bounded interval $a < x < b$, consider the elementary boundary
value problem

$$ -\frac{d^2 u}{dx^2} = f(x), \qquad u(a) = u(b) = 0. $$

The underlying vector space is $U = \{ u(x) \in C^2[a,b] \mid u(a) = u(b) = 0 \}$. To obtain a
weak formulation, we multiply the differential equation by a test function $v(x) \in U$ and
integrate:

$$ \int_a^b \bigl[ -u''(x) - f(x) \bigr]\, v(x)\, dx = 0. \tag{10.62} $$

The left-hand integral can be identified with the $L^2$ inner product between the left-hand
side of the equation $L[u] - f = -u'' - f = 0$ and the test function $v$. According to
Lemma 10.10, condition (10.62) holds for all $v(x) \in U$ if and only if $u(x) \in U$ satisfies the
boundary value problem. However, suppose that we integrate the first term by parts once.
The boundary conditions on $v$ imply that the boundary terms vanish, and the result is

$$ \int_a^b \bigl[ u'(x)\, v'(x) - f(x)\, v(x) \bigr]\, dx = 0. \tag{10.63} $$

A function $u(x)$ that satisfies the latter integral condition for all smooth test functions $v(x)$
will be called a weak solution to the original boundary value problem. The key observation
is that the original differential equation, as well as the integral reformulation (10.62),
requires that $u(x)$ be twice differentiable, whereas the weak version (10.63) requires only
that its first derivative be defined.

† Shortly, as in the general framework developed in Chapter 9, $V$ will be identified as the
target space of a linear operator $L: U \to V$, and hence the choice of notation for its inner product.

Of course, one need not stop at (10.63). Performing another integration by parts on
its first term and invoking the boundary conditions on $u$ produces

$$ \int_a^b \bigl[ -u(x)\, v''(x) - f(x)\, v(x) \bigr]\, dx = 0. \tag{10.64} $$

Now $u(x)$ need only be (piecewise) continuous in order that the integral be defined,
keeping in mind that the test function $v(x)$ is still required to be smooth. Equation (10.64)
is sometimes referred to as the fully weak formulation of the boundary value problem, while
the intermediate integral (10.63), in which the derivatives are evenly distributed among $u$
and $v$, is then known as the semi-weak formulation.
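Both weak formulations can be tested numerically on a problem whose classical solution is known. The sketch below is an illustration under assumed data: the interval $[0,1]$, the forcing $f \equiv 1$, the classical solution $u(x) = \frac12 x (1 - x)$, and the test function $v(x) = \sin \pi x$ are choices made here, not taken from the example, and `integrate` is a hypothetical helper implementing the composite Simpson rule.

```python
import math

def integrate(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    step = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * step)
    return total * step / 3.0

# ASSUMED data: -u'' = f on [0, 1] with f = 1 and u(0) = u(1) = 0.
f = lambda x: 1.0
u = lambda x: 0.5 * x * (1.0 - x)          # classical solution
du = lambda x: 0.5 - x                     # its first derivative

# A smooth test function vanishing at both endpoints, with its derivatives.
v = lambda x: math.sin(math.pi * x)
dv = lambda x: math.pi * math.cos(math.pi * x)
d2v = lambda x: -math.pi ** 2 * math.sin(math.pi * x)

# Integral of u' v' - f v, with one derivative on each of u and v.
semi_weak = integrate(lambda x: du(x) * dv(x) - f(x) * v(x), 0.0, 1.0)
# Integral of -u v'' - f v, with both derivatives moved onto v.
fully_weak = integrate(lambda x: -u(x) * d2v(x) - f(x) * v(x), 0.0, 1.0)
print(semi_weak, fully_weak)               # both should vanish up to quadrature error
```

Replacing $u$ by a function that does not solve the boundary value problem makes both integrals nonzero for this test function, which is the content of the weak criterion.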


Remark :  Recall  also  the  Definition  6.5  of  weak  convergence,  which  similarly  involves 
integrating  the  standard  convergence  criterion  against  a  suitable  test  function.  Both  are 
part  and  parcel  of  a  general  weak  analytical  framework  that  plays  an  essential  role  in  all 
of  modern  advanced  analysis,  including  partial  differential  equations. 


The preceding example is a particular case of a general construction based on the
abstract formulation of self-adjoint linear systems in Chapter 9. Let $L: U \to V$ be a linear
map between inner product spaces, and let $S = L^* \circ L: U \to U$ be the associated
self-adjoint operator. We further assume that $\ker L = \{0\}$, which implies that $S > 0$ is positive
definite and, provided $f \in \operatorname{rng} S$, the associated linear system

$$ S[u] = f \tag{10.65} $$

has a unique solution.

In order to construct a weak formulation of the linear system (10.65), we begin by
taking its inner product with a test function $v \in U$, whereby

$$ 0 = \langle S[u] - f, v \rangle = \langle S[u], v \rangle - \langle f, v \rangle = \langle L^* \circ L[u], v \rangle - \langle f, v \rangle. $$

Integration by parts, as in the preceding example, amounts to moving the adjoint operator
so that it acts on the test function $v$, and in this manner we obtain the weak formulation

$$ \langle\!\langle L[u], L[v] \rangle\!\rangle = \langle f, v \rangle \qquad \text{for all} \quad v \in U, \tag{10.66} $$

where we use our usual notation conventions regarding the inner products on $U$ and $V$.

Warning: Unlike the minimization principle (10.1), the weak formulation (10.66) does
not have a factor of $\frac12$ on the left-hand side. Since, in the applications treated here, $L$ is
a differential operator of order, say, $k$, the weak formulation requires only that $u \in C^k$
be $k$ times differentiable, whereas, since $S$ has order $2k$, the classical formulation (10.65)
requires $u \in C^{2k}$ to have twice as many derivatives.

Similarly, the fully weak formulation involves an additional integration by parts,
realized in the abstract framework by moving the linear operator $L$ acting on $u$ so as to act
on the test element $v$, and so

$$ \langle u, L^* \circ L[v] \rangle = \langle u, S[v] \rangle = \langle f, v \rangle \qquad \text{for all} \quad v \in U. \tag{10.67} $$


In  practice,  it  is  often  advantageous  to  restrict  the  class  of  test  functions  in  order  to 
avoid  technicalities  involving  smoothness  and  boundary  behavior.  This  requires  replacing 
the  simple  argument  used  to  establish  Lemma  10.10  by  a  more  sophisticated  result,  named 
after  the  nineteenth-century  German  analyst  Paul  du  Bois-Reymond. 

Lemma 10.12. Let $f(x)$ be a continuous function for $a \le x \le b$. Then

$$ \int_a^b f(x)\, v(x)\, dx = 0 $$

for every $C^1$ function $v(x)$ with compact support in the open interval $(a,b)$ if and only if
$f(x) \equiv 0$.

Proof: Suppose $f(x_0) > 0$ for some $a < x_0 < b$. Then, by continuity, $f(x) > 0$ for all
$x$ in some interval $a < x_0 - \varepsilon < x < x_0 + \varepsilon < b$ around $x_0$. Choose $v(x)$ to be a $C^1$ function
that is strictly positive in this interval and vanishes outside. An example is

$$ v(x) = \begin{cases} \bigl( (x - x_0)^2 - \varepsilon^2 \bigr)^2, & | x - x_0 | \le \varepsilon, \\ 0, & \text{otherwise.} \end{cases} \tag{10.68} $$

Then $f(x)\, v(x) > 0$ when $| x - x_0 | < \varepsilon$ and $= 0$ everywhere else. This implies

$$ \int_a^b f(x)\, v(x)\, dx = \int_{x_0 - \varepsilon}^{x_0 + \varepsilon} f(x)\, v(x)\, dx > 0, $$

which contradicts the original assumption. An analogous argument rules out $f(x_0) < 0$ for
some $a < x_0 < b$. Q.E.D.


Finite  Elements  Based  on  Weak  Solutions 

To characterize weak solutions, one imposes the appropriate integral criterion on the entire
infinite-dimensional space of smooth test functions. Thus, an evident approximation
strategy is to restrict the criterion to a suitable finite-dimensional subspace, thereby seeking an
approximate weak solution that belongs to the subspace.

More precisely, concentrating on the self-adjoint framework discussed at the end of the
preceding subsection, we restrict the weak formulation (10.66) of the linear system (10.65)
to a finite-dimensional subspace $W \subset U$, and thus seek $w \in W$ such that

$$ \langle\!\langle L[w], L[v] \rangle\!\rangle = \langle f, v \rangle \qquad \text{for all} \quad v \in W. \tag{10.69} $$

In this fashion, we characterize the finite element approximation to the weak solution $u$ as
the element $w \in W$ such that (10.69) holds for all $v \in W$.

To analyze this condition, as in (10.3), we now specify a basis $\varphi_1, \ldots, \varphi_n$ of $W$, and
thus can write both $w$ and $v$ as linear combinations thereof:

$$ w = c_1 \varphi_1 + \cdots + c_n \varphi_n, \qquad v = d_1 \varphi_1 + \cdots + d_n \varphi_n. $$

Substituting these expressions into (10.69) produces the bilinear function

$$ B(c, d) = \sum_{i,j=1}^{n} k_{ij}\, c_i\, d_j - \sum_{i=1}^{n} b_i\, d_i = c^T K d - b^T d = (K c - b)^T d = 0, \tag{10.70} $$

where

$$ k_{ij} = \langle\!\langle L[\varphi_i] , L[\varphi_j] \rangle\!\rangle, \qquad b_i = \langle f , \varphi_i \rangle, \qquad i, j = 1, \ldots, n, \tag{10.71} $$

are the same as our earlier specifications (10.6,7), and we used the fact that $K^T = K$
is a symmetric matrix to arrive at the final expression in (10.70). The condition that
(10.69) hold for all $v \in W$ is equivalent to the requirement that (10.70) hold for all
$d = (d_1, d_2, \ldots, d_n)^T \in \mathbb{R}^n$, which, in turn, implies that $c = (c_1, c_2, \ldots, c_n)^T$ must satisfy
the linear system

$$ K c = b. $$

But we immediately recognize that this is exactly the same as the finite element linear
system (10.9)! We therefore conclude that for a positive definite linear system constructed as
above, the weak finite element approximation to the solution is the same as the minimizing
finite element approximation. In other words, it does not matter whether we characterize
the solutions through the minimization principle or the weak reformulation; the resulting
finite element approximations are exactly the same. There is thus no need to present any
additional examples illustrating this construction.
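
For readers who like to see the linear system K c = b assembled explicitly, here is a minimal sketch under assumptions of our own choosing: the model problem $-u'' = f$ on $[0, 1]$ with $u(0) = u(1) = 0$ (so that $L = d/dx$), piecewise-linear hat functions on a uniform mesh as the basis $\varphi_i$, and a one-point quadrature for $b_i = \langle f , \varphi_i \rangle$. The function names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; all inputs are lists."""
    n = len(diag)
    d, r = diag[:], rhs[:]
    for i in range(1, n):                       # forward elimination
        m = lower[i - 1] / d[i - 1]
        d[i] -= m * upper[i - 1]
        r[i] -= m * r[i - 1]
    x = [0.0] * n
    x[-1] = r[-1] / d[-1]
    for i in range(n - 2, -1, -1):              # back substitution
        x[i] = (r[i] - upper[i] * x[i + 1]) / d[i]
    return x


def weak_finite_element(f, n):
    """Solve K c = b from (10.70)-(10.71) for -u'' = f with n interior nodes."""
    h = 1.0 / (n + 1)
    nodes = [(i + 1) * h for i in range(n)]
    # k_ij = integral of phi_i' phi_j' for hat functions: 2/h on the diagonal,
    # -1/h for neighbouring nodes, and zero otherwise.
    diag = [2.0 / h] * n
    off = [-1.0 / h] * (n - 1)
    # b_i = <f, phi_i>, approximated by h * f(x_i) (exact when f is constant).
    b = [h * f(x) for x in nodes]
    return nodes, solve_tridiagonal(off, diag, off, b)


# Sample run with f = 1, whose exact solution is u(x) = x (1 - x) / 2.
nodes, c = weak_finite_element(lambda x: 1.0, 9)
```

Since K is symmetric and positive definite here, the same coefficients c would be obtained by minimizing the associated quadratic function, in line with the conclusion above.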

In general, while the weak formulation is of much wider applicability, outside of boundary
value problems with well-defined minimization principles, the rigorous underpinning
that guarantees that the numerical solution is close to the actual solution is harder to
establish and, in fact, not always valid. Indeed, one can find boundary value problems without
analytic solutions that have spurious finite element numerical solutions, and, conversely,
boundary value problems with solutions for which some finite element approximations do
not exist because the resulting coefficient matrix is singular, [113, 126].


Shock  Waves  as  Weak  Solutions 


Finally, let us return to our earlier analysis, in Section 2.3, of shock waves, but now in the
context of weak solutions. We begin by writing the nonlinear transport equation in the
conservative form

$$ \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} \left( \tfrac12 u^2 \right) = 0. \tag{10.72} $$


Since  shock  waves  are  discontinuous  functions,  they  do  not  qualify  as  classical  solutions. 
However,  they  can  be  rigorously  characterized  as  weak  solutions,  a  formulation  that  will, 
reassuringly,  lead  to  the  Rankine-Hugoniot  Equal  Area  Rule  for  shock  dynamics. 

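To construct a weak formulation of the nonlinear transport equation, we follow the
general framework, and hence begin by multiplying the equation (10.72) by a smooth test
function $v(t,x)$ and integrating over a domain $\Omega \subset \mathbb{R}^2$:

$$ \iint_\Omega \left[ \frac{\partial u}{\partial t} + \frac{\partial}{\partial x} \left( \tfrac12 u^2 \right) \right] v(t,x)\, dt\, dx = 0. \tag{10.73} $$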


As a direct consequence of the two-dimensional version of the du Bois-Reymond Lemma,
cf. Exercise 10.4.7, if $u(t,x) \in C^1$ and condition (10.73) holds for all $C^1$ functions $v(t,x)$
with compact support contained in $\Omega$, then $u(t,x)$ is necessarily a classical solution to the
partial differential equation (10.72). The next step is to integrate by parts in order to
remove the derivatives from $u$, and this is accomplished by appealing to Green's formula
(6.82), which we rewrite in the form

$$ \iint_\Omega \left( u_1 \frac{\partial v}{\partial t} + u_2 \frac{\partial v}{\partial x} \right) dt\, dx = \oint_{\partial\Omega} (\mathbf{u} \cdot \mathbf{n})\, v\, ds - \iint_\Omega \left( \frac{\partial u_1}{\partial t} + \frac{\partial u_2}{\partial x} \right) v\, dt\, dx, \tag{10.74} $$

where $\mathbf{u} = (u_1, u_2)^T$. In our case, we identify the integral in (10.73) with the double
integral on the right-hand side of (10.74) by setting $u_1 = u$, $u_2 = \tfrac12 u^2$. Since $v$ has compact
support, the boundary integral vanishes, and thus we arrive at the weak formulation of the equation.

Figure 10.15. Integration domain for weak shock-wave solution.

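Definition 10.13. A function $u(t,x)$ is said to be a weak solution to the nonlinear
transport equation (10.72) on $\Omega \subset \mathbb{R}^2$ if

$$ \iint_\Omega \left( u\, \frac{\partial v}{\partial t} + \tfrac12 u^2\, \frac{\partial v}{\partial x} \right) dt\, dx = 0 \tag{10.75} $$

for all $C^1$ functions $v(t,x)$ with compact support: $\operatorname{supp} v \subset \Omega$.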


The  key  point  is  that,  in  the  weak  formulation  (10.75),  the  derivatives  are  acting  solely 
on  v(t,  x),  which  we  assume  to  be  smooth,  and  not  on  our  prospective  solution  u(t,  x),  which 
now  need  not  even  be  continuous  for  the  integral  to  be  well  defined. 

Let us derive the Rankine-Hugoniot shock condition (2.53) as a consequence of the
weak formulation. Suppose $u(t,x)$ is a weak solution, defined on a domain $\Omega \subset \mathbb{R}^2$, that
has a single jump discontinuity along a curve $C$ parametrized by $x = \sigma(t)$ that separates
$\Omega$ into two subdomains, say $\Omega_+$ and $\Omega_-$, such that its restrictions to the two subdomains,
denoted by $u_+ = u \,|\, \Omega_+$ and $u_- = u \,|\, \Omega_-$, are each classical solutions on their respective
domains, while the separating curve $C = \{ x = \sigma(t) \}$ represents a shock-wave discontinuity.
For specificity, we assume that $\Omega_+$ lies above and $\Omega_-$ lies below $C$ in the $(t, x)$-plane; see
Figure 10.15. Let us investigate what the preceding weak formulation implies in this
situation. We split the integral (10.75) into two parts, and then apply the integration
by parts formula (10.74) to each individual double integral, keeping in mind that, when
restricted to $\Omega_+$ or $\Omega_-$, the integrand is sufficiently smooth to justify application of the
formula:

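$$ \begin{aligned}
0 &= \iint_\Omega \left( u\, \frac{\partial v}{\partial t} + \tfrac12 u^2\, \frac{\partial v}{\partial x} \right) dt\, dx \\
&= \iint_{\Omega_+} \left( u_+ \frac{\partial v}{\partial t} + \tfrac12 u_+^2\, \frac{\partial v}{\partial x} \right) dt\, dx + \iint_{\Omega_-} \left( u_- \frac{\partial v}{\partial t} + \tfrac12 u_-^2\, \frac{\partial v}{\partial x} \right) dt\, dx \\
&= \oint_{\partial\Omega_+} (\mathbf{u}_+ \cdot \mathbf{n}_+)\, v\, ds - \iint_{\Omega_+} \left[ \frac{\partial u_+}{\partial t} + \frac{\partial}{\partial x} \left( \tfrac12 u_+^2 \right) \right] v\, dt\, dx \\
&\quad + \oint_{\partial\Omega_-} (\mathbf{u}_- \cdot \mathbf{n}_-)\, v\, ds - \iint_{\Omega_-} \left[ \frac{\partial u_-}{\partial t} + \frac{\partial}{\partial x} \left( \tfrac12 u_-^2 \right) \right] v\, dt\, dx \\
&= \int_C (\mathbf{u}_+ \cdot \mathbf{n}_+ + \mathbf{u}_- \cdot \mathbf{n}_-)\, v\, ds.
\end{aligned} $$

Here

$$ \mathbf{u}_+ = \begin{pmatrix} u_+ \\ \tfrac12 u_+^2 \end{pmatrix}, \qquad \mathbf{u}_- = \begin{pmatrix} u_- \\ \tfrac12 u_-^2 \end{pmatrix}, $$

while $\mathbf{n}_+, \mathbf{n}_-$ are the unit outwards normals on, respectively, $\partial\Omega_+$ and $\partial\Omega_-$. The final
equality follows from two facts. First, $u_+$ and $u_-$ are classical solutions on their respective
subdomains, so both double integrals vanish. Second, the support of $v$ is contained strictly
inside $\Omega$, and hence $v$ vanishes on those parts of the boundaries of $\Omega_+$ and $\Omega_-$ that do not
lie on the curve $C$. In particular, since $C$ is the graph of $x = \sigma(t)$, the unit normals along
it are, respectively,

$$ \mathbf{n}_+ = \frac{1}{\sqrt{1 + (d\sigma/dt)^2}} \begin{pmatrix} d\sigma/dt \\ -1 \end{pmatrix}, \qquad \mathbf{n}_- = \frac{1}{\sqrt{1 + (d\sigma/dt)^2}} \begin{pmatrix} -\,d\sigma/dt \\ 1 \end{pmatrix}, $$

keeping in mind our convention that $\Omega_+$ lies above and $\Omega_-$ lies below $C$, while the arc
length element is $ds = \sqrt{1 + (d\sigma/dt)^2}\; dt$. Thus, the final line integral reduces to

$$ \int_C \left[ (u_+ - u_-)\, \frac{d\sigma}{dt} - \tfrac12 \left( u_+^2 - u_-^2 \right) \right] v\, dt = 0. \tag{10.76} $$

Since (10.76) holds for all $C^1$ functions $v(t,x)$ with compact support, the du Bois-Reymond
Lemma 10.12 implies that the bracketed factor vanishes or, equivalently,

$$ (u_- - u_+)\, \frac{d\sigma}{dt} = \tfrac12 \left( u_-^2 - u_+^2 \right) \quad \text{on } C, $$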

thereby  re-establishing  the  Rankine-Hugoniot  shock  condition  (2.53).  The  upshot  is  that 
the  shock- wave  solutions  produced  in  Section  2.3  are  bona  fide  weak  solutions. 
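
The role of the shock speed can also be seen numerically. The sketch below rests on assumptions that are ours alone: a piecewise constant shock with $u_- = 1$ below and $u_+ = 0$ above the line $x = s\,t$, a particular $C^1$ test function built from products of polynomial bumps supported inside $\Omega = (0, 2) \times (-2, 2)$, and a midpoint rule for the double integral in (10.75). All function names are hypothetical. When $s$ equals the Rankine-Hugoniot speed $\tfrac12 (u_- + u_+)$ the integral is zero up to quadrature error; for a different speed it is not.

```python
def bump(z):
    """C^1 bump: (1 - z^2)^2 for |z| < 1, zero elsewhere."""
    return (1.0 - z * z) ** 2 if abs(z) < 1.0 else 0.0


def dbump(z):
    """Derivative of bump with respect to z."""
    return -4.0 * z * (1.0 - z * z) if abs(z) < 1.0 else 0.0


def rankine_hugoniot_speed(u_minus, u_plus):
    """Shock speed from (u_- - u_+) s = (u_-^2 - u_+^2) / 2, for u_- != u_+."""
    return 0.5 * (u_minus + u_plus)


def weak_residual(speed, u_minus=1.0, u_plus=0.0, n=400):
    """Midpoint-rule value of the integral in (10.75) for the shock x = speed * t."""
    t_lo, t_hi, x_lo, x_hi = 0.0, 2.0, -2.0, 2.0
    ht, hx = (t_hi - t_lo) / n, (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        t = t_lo + (i + 0.5) * ht
        zt = (t - 1.0) / 0.8
        bt, dbt = bump(zt), dbump(zt) / 0.8
        for j in range(n):
            x = x_lo + (j + 0.5) * hx
            zx = x / 1.5
            v_t = dbt * bump(zx)                 # dv/dt for v = bump(zt) * bump(zx)
            v_x = bt * dbump(zx) / 1.5           # dv/dx
            u = u_minus if x < speed * t else u_plus
            total += (u * v_t + 0.5 * u * u * v_x) * ht * hx
    return total


s = rankine_hugoniot_speed(1.0, 0.0)
residual_correct = weak_residual(s)      # close to zero
residual_wrong = weak_residual(0.8)      # clearly nonzero
```

For this configuration one can check by hand that the exact value of the integral is $(\tfrac12 - s) \int v(t, s\,t)\, dt$, which vanishes precisely when $s = \tfrac12$.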

Another computation shows that the rarefaction wave (2.54) also qualifies as a weak
solution. However, so does the non-physical reverse shock solution discussed in
Example 2.11. Thus, although the weak formulation recovers the Rankine-Hugoniot condition,
it does not address the problem of causality, which must be additionally imposed to single
out a unique, physically meaningful weak solution. Further developments of these ideas
can be found in more advanced monographs, e.g., [107, 122].


Exercises 


10.4.1. Write out semi-weak and fully weak formulations for the following boundary value
problems: (a) $-u'' + 2u = x - x^2$, $u(0) = u(1) = 0$;
(b) $e^x u'' + u = \cos x$, $u(0) = u(2) = 0$; (c) $x u'' + u' + x u = 0$, $u(1) = u(2) = 0$.

10.4.2. (a) Write down a weak formulation for the boundary value problem $-u'' + 3u = x$,
$u(0) = u(1) = 0$. (b) Based on your weak formulation, construct a finite element
approximation to the solution, using $n = 10$ nodes.

10.4.3. (a) Write down a weak formulation of the transport equation $u_t + 3 u_x = 0$ on the real
line. (b) Solve the initial value problem

$$ u(0, x) = \begin{cases} 1 - |x|, & |x| \le 1, \\ 0, & \text{otherwise.} \end{cases} $$

(c) Explain why the result of part (b) is not a classical solution to the transport equation. Is it
a weak solution according to your formulation in part (a)?

10.4.4. (a) Write down a semi-weak formulation of the wave equation $u_{tt} = 4 u_{xx}$ on the real
line. (b) Solve the initial value problem $u(0, x) = \rho(x)$, $u_t(0, x) = 0$, where the initial
displacement is a ramp function (6.25). (c) Explain why the result of part (b) is not a
classical solution to the wave equation. Does it satisfy the semi-weak formulation of part
(a)? Explain your answer.

10.4.5. (a) Starting with the nonlinear transport equation written in the alternative
conservative form (2.56), find a corresponding weak formulation.
(b) Prove that your weak formulation produces the alternative entropy condition (2.58) for
the motion of a shock discontinuity.

10.4.6. Prove that the du Bois-Reymond Lemma 10.12 remains valid even when $v(x) \in C^\infty$ is
required to be infinitely differentiable.

10.4.7. The Two-dimensional du Bois-Reymond Lemma: Let $\Omega \subset \mathbb{R}^2$ be a domain, and $f(t,x)$
a continuous function defined thereon. Prove that $\iint_\Omega f(t,x)\, v(t,x)\, dt\, dx = 0$ for every $C^1$
function $v(t,x)$ with compact support in $\Omega$ if and only if $f(t,x) \equiv 0$.

10.4.8. (a) Investigate the ability of finite elements to approximate a solution to the
non-positive-definite boundary value problem $\Delta u + \lambda u = 0$, $0 < x < \pi$, $0 < y < \pi$,
$u(x, 0) = 1$, $u(x, \pi) = u(0, y) = u(\pi, y) = 0$, when (i) $\lambda = 1$, (ii) $\lambda = 2$. (b) Use separation of
variables to find a series solution and use it to determine the accuracy of your finite element
solution in part (a).


Chapter  11 

Dynamics  of  Planar  Media 


In  previous  chapters,  we  studied  the  equilibrium  configurations  of  planar  media  —  plates 
and  membranes  —  governed  by  the  two-dimensional  Laplace  and  Poisson  equations.  In 
this  chapter,  we  analyze  their  dynamics,  modeled  by  the  two-dimensional  heat  and  wave 
equations.  The  heat  equation  describes  diffusion  of,  say,  heat  energy  in  a  thin  metal  plate, 
an  animal  population  dispersing  over  a  region,  or  a  pollutant  spreading  out  into  a  shallow 
lake.  The  wave  equation  models  small  vibrations  of  a  two-dimensional  membrane  such  as  a 
drum.  Since  both  equations  fit  into  the  general  framework  for  dynamics  that  we  established 
in  Section  9.5,  their  solutions  share  many  of  the  general  qualitative  and  analytic  properties 
possessed  by  their  respective  one-dimensional  counterparts. 

Although the increase in dimension may tax our analytical prowess, we have, in fact,
already mastered the principal solution techniques: separation of variables, eigenfunction
series, and fundamental solutions. When applied to partial differential equations in higher
dimensions, separation of variables in curvilinear coordinates often leads to new linear,
but non-constant-coefficient, ordinary differential equations, whose solutions are no longer
elementary functions. Rather, they are expressed in terms of a variety of important special
functions, which include the error and Airy functions we encountered earlier; the Bessel
functions, which play a starring role in the present chapter; and the Legendre and Ferrers
functions, spherical harmonics, and spherical Bessel functions arising in three-dimensional
problems. Special functions are ubiquitous in more advanced applications in physics,
chemistry, mechanics, and mathematics, and, over the last two hundred and fifty years, many
prominent mathematicians have devoted significant effort to establishing their fundamental
properties, to the extent that they are now, by and large, well understood, [86]. To
acquire the requisite familiarity with special functions, in preparation for employing them
to solve higher-dimensional partial differential equations, we must first learn basic series
solution techniques for linear second-order ordinary differential equations.

11.1  Diffusion  in  Planar  Media 

As  we  learned  in  Chapter  4,  the  equilibrium  temperature  u(x,y)  of  a  thin,  uniform,  isotropic 
plate  is  governed  by  the  two-dimensional  Laplace  equation 

$$ \Delta u = u_{xx} + u_{yy} = 0. $$

Working by analogy, the dynamical diffusion of the plate's temperature should be modeled
by the two-dimensional heat equation

$$ u_t = \gamma\, \Delta u = \gamma\, (u_{xx} + u_{yy}). \tag{11.1} $$

The coefficient $\gamma > 0$, assumed constant, measures the relative speed of diffusion of heat
energy throughout the plate; its positivity is required on physical grounds, and also serves
to avoid ill-posedness inherent in running diffusion processes backwards in time. In this
model, we are assuming that the plate is uniform and isotropic, and experiences no loss of
heat or external heat sources other than at its edge, which can be arranged by covering
its top and bottom with insulation.

The solution $u(t, \mathbf{x}) = u(t,x,y)$ to the heat equation measures the temperature, at
time $t$, at each point $\mathbf{x} = (x, y)$ in the (bounded) domain $\Omega \subset \mathbb{R}^2$ occupied by the plate.
To uniquely specify the solution $u(t,x,y)$, we must impose suitable initial and boundary
conditions. The initial data is the temperature of the plate

$$ u(0, x, y) = f(x, y), \qquad (x, y) \in \Omega, \tag{11.2} $$

at an initial time, which, for simplicity, we take to be $t_0 = 0$. The most important boundary
conditions are as follows:

• Dirichlet boundary conditions: Specifying

$$ u = h \quad \text{on } \partial\Omega \tag{11.3} $$

fixes the temperature along the edge of the plate.

• Neumann boundary conditions: Let $\mathbf{n}$ be the unit outwards normal on the boundary
of the domain. Specifying the normal derivative of the temperature,

$$ \frac{\partial u}{\partial \mathbf{n}} = k \quad \text{on } \partial\Omega, \tag{11.4} $$

effectively prescribes the heat flux along the boundary. Setting $k = 0$ corresponds
to an insulated boundary.

• Mixed boundary conditions: More generally, we can impose Dirichlet conditions on
part of the boundary $D \subset \partial\Omega$ and Neumann conditions on its complement
$N = \partial\Omega \setminus D$. For instance, homogeneous mixed boundary conditions

$$ u = 0 \quad \text{on } D, \qquad \frac{\partial u}{\partial \mathbf{n}} = 0 \quad \text{on } N, \tag{11.5} $$

correspond to freezing a portion of the boundary and insulating the remainder.

• Robin boundary conditions:

$$ \frac{\partial u}{\partial \mathbf{n}} + \beta\, u = \tau \quad \text{on } \partial\Omega, \tag{11.6} $$

where the edge of the plate sits in a heat bath at temperature $\tau$.

Under  reasonable  assumptions  on  the  domain,  the  initial  data,  and  the  boundary  data, 
a  general  theorem,  [34,  38,  99],  guarantees  the  existence  of  a  unique  solution  u(t,x,y)  to 
any  of  these  initial-boundary  value  problems  for  all  subsequent  times  t  >  0.  Our  practical 
goal  is  to  both  compute  and  understand  the  behavior  of  the  solution  in  specific  situations. 


Derivation  of  the  Diffusion  and  Heat  Equations 

The physical derivation of the two-dimensional (and three-dimensional) heat equation relies
on the same basic thermodynamic laws that were used, in Section 4.1, to establish the
one-dimensional version. The first principle is that heat energy flows from hot to cold as
rapidly as possible. According to multivariable calculus, [8, 108], the negative temperature
gradient $-\nabla u$ points in the direction of the steepest decrease in the temperature function
$u$ at a point, and so heat energy will flow in that direction. Therefore, the heat flux vector
$\mathbf{w}$, which measures the magnitude and direction of the flow of heat energy, should be
proportional to the temperature gradient:

$$ \mathbf{w}(t, x, y) = -\,\kappa(x, y)\, \nabla u(t, x, y). \tag{11.7} $$

The scalar quantity $\kappa(x,y) > 0$ measures the thermal conductivity of the material, so
(11.7) is the multi-dimensional form of Fourier's Law of Cooling (4.5). We are assuming
that the thermal conductivity depends only on the position $(x,y) \in \Omega$, which means that
the material in the plate

(a)  is  not  changing  in  time; 

(b)  is  isotropic ,  meaning  that  its  thermal  conductivity  is  the  same  in  all  directions; 

(c)  and,  moreover,  its  thermal  conductivity  is  not  affected  by  any  change  in  temperature. 

Dropping  either  assumption  (b)  or  (c)  would  result  in  a  considerably  more  challenging 
nonlinear  diffusion  equation. 

The second thermodynamic principle is that, in the absence of external heat sources,
heat can enter any subregion $R \subset \Omega$ only through its boundary $\partial R$. (Keep in mind that
the plate is insulated from above and below.) Let $\varepsilon(t,x,y)$ denote the heat energy density
at each time and point in the domain, so that

$$ H_R(t) = \iint_R \varepsilon(t, x, y)\, dx\, dy $$

represents the total heat energy contained within the subregion $R$ at time $t$. The amount
of additional heat energy entering $R$ at a boundary point $\mathbf{x} \in \partial R$ is given by the normal
component of the heat flux vector, namely $-\,\mathbf{w} \cdot \mathbf{n}$, where, as always, $\mathbf{n}$ denotes the outward
unit normal to the boundary $\partial R$. Thus, the total heat flux entering the region $R$ is obtained
by integration along the boundary of $R$, resulting in the line integral $-\oint_{\partial R} \mathbf{w} \cdot \mathbf{n}\, ds$.
Equating the rate of change of heat energy to the heat flux yields

$$ \frac{dH_R}{dt} = \iint_R \frac{\partial \varepsilon}{\partial t}(t, x, y)\, dx\, dy = -\oint_{\partial R} \mathbf{w} \cdot \mathbf{n}\, ds = -\iint_R \nabla \cdot \mathbf{w}\, dx\, dy, $$

where we applied the divergence form of Green's Theorem, (6.80), to convert the flux line
integral into a double integral. Thus,

$$ \iint_R \left( \frac{\partial \varepsilon}{\partial t} + \nabla \cdot \mathbf{w} \right) dx\, dy = 0. \tag{11.8} $$

Keep in mind that this result must hold for any subdomain $R \subset \Omega$. Now, according to
Exercise 11.1.13, the only way in which an integral of a continuous function can vanish for
all subdomains is if the integrand is identically zero, and so

$$ \frac{\partial \varepsilon}{\partial t} + \nabla \cdot \mathbf{w} = 0. \tag{11.9} $$

In this manner, we arrive at the basic conservation law relating the heat energy density $\varepsilon$
and the heat flux vector $\mathbf{w}$.
As in our one-dimensional model, cf. (4.3), the heat energy density $\varepsilon(t, x, y)$ is proportional
to the temperature, so

$$ \varepsilon(t,x,y) = \sigma(x,y)\, u(t,x,y), \qquad \text{where} \qquad \sigma(x, y) = \rho(x, y)\, \chi(x, y) \tag{11.10} $$

is the product of the density $\rho$ and the specific heat capacity $\chi$ of the material at the
point $(x,y) \in \Omega$. Combining this with the Fourier Law (11.7) and the energy balance
equation (11.9) leads to the general two-dimensional diffusion equation

$$ \frac{\partial u}{\partial t} = \frac{1}{\sigma}\, \nabla \cdot (\kappa\, \nabla u) \tag{11.11} $$

governing the thermodynamics of an isotropic medium in the absence of external heat
sources or sinks. In full detail, this second-order partial differential equation is

$$ \frac{\partial u}{\partial t} = \frac{1}{\sigma(x, y)} \left[ \frac{\partial}{\partial x} \left( \kappa(x,y)\, \frac{\partial u}{\partial x} \right) + \frac{\partial}{\partial y} \left( \kappa(x,y)\, \frac{\partial u}{\partial y} \right) \right]. \tag{11.12} $$
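
The conservation law behind (11.12) survives discretization if one works with fluxes across cell faces. The following is a rough sketch, not a recommended production scheme; the uniform grid, the arithmetic averaging of $\kappa$ at cell faces, the forward Euler time step, the insulated (zero flux) outer boundary, and the sample coefficient fields are all assumptions made for illustration, and the function names are hypothetical.

```python
def diffusion_step(u, sigma, kappa, h, dt):
    """One forward Euler step of u_t = (1/sigma) div(kappa grad u) on a cell grid."""
    nx, ny = len(u), len(u[0])
    new = [row[:] for row in u]
    for i in range(nx):
        for j in range(ny):
            inflow = 0.0                     # sum of -w.n over the interior faces
            for p, q in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if 0 <= p < nx and 0 <= q < ny:      # boundary faces carry no flux
                    k_face = 0.5 * (kappa[i][j] + kappa[p][q])
                    inflow += k_face * (u[p][q] - u[i][j]) / h
            new[i][j] = u[i][j] + dt * inflow / (sigma[i][j] * h)
    return new


def total_energy(u, sigma, h):
    """Discrete version of the integral of epsilon = sigma * u over the plate."""
    return sum(s * v for rs, ru in zip(sigma, u) for s, v in zip(rs, ru)) * h * h


n, h = 8, 1.0 / 8
sigma = [[1.0 + (i + 0.5) * h for j in range(n)] for i in range(n)]        # sample data
kappa = [[1.0 + 0.5 * (j + 0.5) * h for j in range(n)] for i in range(n)]  # sample data
u = [[0.0] * n for _ in range(n)]
u[3][4] = 1.0                               # a single hot cell
dt = 0.2 * h * h / 1.5                      # small enough for stability here
energy_before = total_energy(u, sigma, h)
for _ in range(50):
    u = diffusion_step(u, sigma, kappa, h, dt)
energy_after = total_energy(u, sigma, h)
```

Because every interior face contributes equal and opposite amounts to its two neighbouring cells, the total heat energy is unchanged by each step, mirroring (11.8) with no flux through the boundary.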


Such diffusion equations are also used to model movements of populations, e.g., bacteria
in a petri dish or wolves in the Canadian Rockies, [81, 84]. Here the solution $u(t, x, y)$
represents the population density at position $(x, y)$ at time $t$, which diffuses over the
domain due to random motions of the individuals. Similar diffusion processes model the
mixing of solutes in liquids, with the diffusion induced by the random Brownian motion
from molecular collisions. More generally, diffusion processes in the presence of chemical
reactions and convection due to fluid motion are modeled by the more general class of
reaction-diffusion and convection-diffusion equations, [107].

In particular, if the body (or the environment or the solvent) is uniform, then both
$\sigma$ and $\kappa$ are constant, and so (11.11) reduces to the heat equation (11.1) with thermal
diffusivity

$$ \gamma = \frac{\kappa}{\sigma} = \frac{\kappa}{\rho\, \chi}. \tag{11.13} $$

Both the heat and more general diffusion equations are examples of parabolic partial
differential equations, the terminology being adapted from Definition 4.12 to apply to partial
differential equations in more than two variables. As we will see, all the basic qualitative
features of solutions to the one-dimensional heat equation carry over to parabolic partial
differential equations in higher dimensions.

Indeed,  the  general  diffusion  equation  (11.12)  can  be  readily  fit  into  the  self-adjoint 
dynamical  framework  of  Section  9.5,  taking  the  form 


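$$ u_t = -\,\nabla^* \circ \nabla u. \tag{11.14} $$

The gradient operator $\nabla$ maps scalar fields $u$ to vector fields $\mathbf{v} = \nabla u$; its adjoint $\nabla^*$, which
goes in the reverse direction, is taken with respect to the weighted inner products

$$ \langle u , \tilde u \rangle = \iint_\Omega u(x,y)\, \tilde u(x,y)\, \sigma(x,y)\, dx\, dy, \qquad \langle\!\langle \mathbf{v} , \tilde{\mathbf{v}} \rangle\!\rangle = \iint_\Omega \mathbf{v}(x,y) \cdot \tilde{\mathbf{v}}(x,y)\, \kappa(x,y)\, dx\, dy, \tag{11.15} $$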

between, respectively, scalar and vector fields. As in (9.33), a straightforward integration
by parts tells us that

$$ \nabla^* \mathbf{v} = -\,\frac{1}{\sigma}\, \nabla \cdot (\kappa\, \mathbf{v}) = -\,\frac{1}{\sigma} \left[ \frac{\partial (\kappa\, v_1)}{\partial x} + \frac{\partial (\kappa\, v_2)}{\partial y} \right] \qquad \text{when} \qquad \mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}. \tag{11.16} $$

Therefore, the right-hand side of (11.14) equals

$$ -\,\nabla^* \circ \nabla u = \frac{1}{\sigma}\, \nabla \cdot (\kappa\, \nabla u), \tag{11.17} $$

which thereby recovers the general diffusion equation (11.11). As always, the validity of
the adjoint formula (11.16) rests on the imposition of suitable homogeneous boundary
conditions: Dirichlet, Neumann, mixed, or Robin.
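
For completeness, the integration by parts behind (11.16) can be written out as follows. This is a sketch in the notation of (11.15); it assumes sufficiently smooth fields and homogeneous Dirichlet or Neumann-type conditions (so that either $u = 0$ or $\mathbf{v} \cdot \mathbf{n} = 0$ on each part of $\partial\Omega$), which make the boundary integral in the second line vanish. The Robin case needs the modified inner product and is not covered by this computation.

```latex
\begin{aligned}
\langle\!\langle \nabla u , \mathbf{v} \rangle\!\rangle
  &= \iint_\Omega \nabla u \cdot \mathbf{v}\; \kappa \, dx\, dy \\
  &= \oint_{\partial\Omega} u\, \kappa\, (\mathbf{v} \cdot \mathbf{n})\, ds
     - \iint_\Omega u\, \nabla \cdot (\kappa\, \mathbf{v})\, dx\, dy \\
  &= \iint_\Omega u \left[ -\frac{1}{\sigma}\, \nabla \cdot (\kappa\, \mathbf{v}) \right] \sigma \, dx\, dy
   = \langle u , \nabla^* \mathbf{v} \rangle .
\end{aligned}
```

The second line uses the product rule $\nabla \cdot (u\, \kappa\, \mathbf{v}) = \kappa\, \nabla u \cdot \mathbf{v} + u\, \nabla \cdot (\kappa\, \mathbf{v})$ together with the divergence theorem; the factor $1/\sigma$ appears because the scalar inner product carries the weight $\sigma$.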

In particular, to obtain the heat equation, we take $\sigma$ and $\kappa$ to be constant, and so
the inner products (11.15) reduce, up to a constant factor, to the usual $L^2$ inner products
between scalar and vector fields. In this case, the adjoint of the gradient is, up to a scale
factor, minus the divergence: $\nabla^* = -\,\gamma\, \nabla \cdot\,$, where $\gamma = \kappa / \sigma$. In this scenario, (11.14)
reduces to the two-dimensional heat equation (11.1).


Separation  of  Variables 


Let us now discuss analytical solution techniques. According to Section 9.5, the separable
solutions to any linear evolution equation

$$ \frac{\partial u}{\partial t} = -\,S[u] \tag{11.18} $$

are of exponential form

$$ u(t,x,y) = e^{-\lambda t}\, v(x,y). \tag{11.19} $$

Since the linear operator $S$ involves differentiation with respect to only the spatial variables
$x, y$, we obtain

$$ \frac{\partial u}{\partial t} = -\,\lambda\, e^{-\lambda t}\, v(x,y), \qquad \text{while} \qquad S[u] = e^{-\lambda t}\, S[v]. $$

Substituting back into the diffusion equation (11.18) and canceling the exponentials, we
conclude that

$$ S[v] = \lambda\, v. \tag{11.20} $$

Thus, $v(x,y)$ must be an eigenfunction for the linear operator $S$, subject to the relevant
homogeneous boundary conditions.

In the case of the heat equation (11.1),

$$ S[u] = -\,\gamma\, \Delta u, $$

and hence, as in Example 9.40, the eigenvalue equation (11.20) is the two-dimensional
Helmholtz equation

$$ \gamma\, \Delta v + \lambda\, v = 0, \qquad \text{or, in detail,} \qquad \gamma \left( \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} \right) + \lambda\, v = 0. \tag{11.21} $$

According to Theorem 9.34, self-adjointness implies that the eigenvalues are all real and
nonnegative: $\lambda \ge 0$. In the positive definite cases (Dirichlet and mixed boundary
conditions) they are strictly positive, while the Neumann boundary value problem admits
a zero eigenvalue $\lambda_0 = 0$ corresponding to the constant eigenfunction $v_0(x,y) \equiv 1$.

Let us index the eigenvalues in increasing order:

$$ 0 < \lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots, \tag{11.22} $$

repeated according to their multiplicities, where $\lambda_0 = 0$ is an eigenvalue only in the
Neumann case, and $\lambda_k \to \infty$ as $k \to \infty$. For each eigenvalue $\lambda_k$, let $v_k(x, y)$ be an independent
eigenfunction. The corresponding separable solution is

$$ u_k(t,x,y) = e^{-\lambda_k t}\, v_k(x,y). $$

Those corresponding to positive eigenvalues are exponentially decaying in time, while a
zero eigenvalue produces a constant solution $u_0(t,x,y) \equiv 1$. The general solution to the
homogeneous boundary value problem can then be built up as an infinite series in these
basic eigensolutions

$$ u(t,x,y) = \sum_{k=1}^{\infty} c_k\, u_k(t,x,y) = \sum_{k=1}^{\infty} c_k\, e^{-\lambda_k t}\, v_k(x,y). \tag{11.23} $$

The coefficients $c_k$ are prescribed by the initial conditions, which require

$$ \sum_{k=1}^{\infty} c_k\, v_k(x,y) = f(x,y). \tag{11.24} $$

Since $S$ is self-adjoint, Theorem 9.33 guarantees orthogonality† of the eigenfunctions under
the $L^2$ inner product on the domain $\Omega$:

$$ \langle v_j , v_k \rangle = \iint_\Omega v_j(x,y)\, v_k(x,y)\, dx\, dy = 0, \qquad j \ne k. \tag{11.25} $$

As a consequence, the coefficients in (11.24) are given by the standard orthogonality formula
(9.104), namely

$$ c_k = \frac{\langle f , v_k \rangle}{\| v_k \|^2} = \frac{\displaystyle \iint_\Omega f(x,y)\, v_k(x,y)\, dx\, dy}{\displaystyle \iint_\Omega v_k(x,y)^2\, dx\, dy}. \tag{11.26} $$

(For the more general diffusion equation (11.11), one uses the appropriately weighted inner
product.) The exponential decay of the eigenfunction coefficients implies that the resulting
eigensolution series (11.23) converges and thus produces the solution to the initial-boundary
value problem for the diffusion equation. See [34; p. 369] for a precise statement and proof
of the general theorem.
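
To make the recipe (11.23)-(11.26) concrete, here is a small sketch for one specific case that is not worked out in the text above and is assumed purely for illustration: the square $\Omega = (0, \pi)^2$ with homogeneous Dirichlet conditions, where the Helmholtz eigenfunctions are $\sin m x \, \sin n y$ with eigenvalues $\lambda = \gamma (m^2 + n^2)$. The integrals in (11.26) are approximated by the midpoint rule, and the function names and sample initial data are hypothetical.

```python
import math


def coefficient(f, m, n, grid=60):
    """c_mn = <f, v_mn> / ||v_mn||^2 from (11.26), midpoint rule on (0, pi)^2."""
    h = math.pi / grid
    num = den = 0.0
    for i in range(grid):
        x = (i + 0.5) * h
        for j in range(grid):
            y = (j + 0.5) * h
            v = math.sin(m * x) * math.sin(n * y)
            num += f(x, y) * v
            den += v * v
    return num / den                 # the common factor h * h cancels


def heat_series(f, gamma, modes=3, grid=60):
    """Truncated eigenfunction series (11.23) using modes 1 <= m, n <= modes."""
    coeffs = {(m, n): coefficient(f, m, n, grid)
              for m in range(1, modes + 1) for n in range(1, modes + 1)}

    def u(t, x, y):
        return sum(c * math.exp(-gamma * (m * m + n * n) * t)
                   * math.sin(m * x) * math.sin(n * y)
                   for (m, n), c in coeffs.items())

    return coeffs, u


def initial_temperature(x, y):
    """Sample initial data built from two eigenfunctions."""
    return math.sin(x) * math.sin(y) + 0.5 * math.sin(2 * x) * math.sin(3 * y)


coeffs, u = heat_series(initial_temperature, gamma=1.0)
center_value = u(0.1, math.pi / 2, math.pi / 2)
```

For this initial data only the $(1,1)$ and $(2,3)$ coefficients are nonzero, and the second mode decays much faster than the first, which previews the qualitative discussion that follows.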


Qualitative  Properties 

Before tackling examples in which we are able to construct explicit formulas for the
eigenfunctions and eigenvalues, let us see what the eigenfunction series solution (11.23) can
tell us about general diffusion processes. Based on our experience with the case of a
one-dimensional bar, the final conclusions will not be especially surprising. Indeed, they also
apply, word for word, to diffusion processes in three-dimensional solid bodies. A reader who
is impatient to see the explicit formulas may wish to skip ahead to the following section,
returning here as needed.


† As usual, in the case of a repeated eigenvalue, one chooses an orthogonal basis of the
associated eigenspace to ensure orthogonality of all the basis eigenfunctions.

Keep in mind that we are still dealing with the solution to the homogeneous boundary
value problem. The first observation is that all terms in the series solution (11.23), with the
possible exception of a null eigenfunction term that appears in the semi-definite Neumann
case, are tending to zero exponentially fast. Since most eigenvalues are large, all the
higher-order terms in the series become almost instantaneously negligible, and hence the solution
can be accurately approximated by a finite sum over the first few eigenfunction modes.
As time goes on, more and more of the modes can be neglected, and the solution decays
to thermal equilibrium at an exponentially fast rate. The rate of convergence to thermal
equilibrium is, for most initial data, governed by the smallest positive eigenvalue $\lambda_1 > 0$
for the Helmholtz boundary value problem on the domain.

In the positive definite cases of homogeneous Dirichlet or mixed boundary conditions,
thermal equilibrium is $u(t, x, y) \to u_\star(x, y) \equiv 0$. Here, the equilibrium temperature is equal
to the zero boundary temperature, even if this temperature is fixed on only a small part
of the boundary. The initial heat is eventually dissipated away through the uninsulated
part of the boundary. In the semi-definite Neumann case, corresponding to a completely
insulated plate, the general solution has the form

$$ u(t,x,y) = c_0 + \sum_{k=1}^{\infty} c_k\, e^{-\lambda_k t}\, v_k(x,y), \tag{11.27} $$

where the sum is over the positive eigenmodes, $\lambda_k > 0$. Since all the summands are
exponentially decaying, the final equilibrium temperature $u_\star = c_0$ is the same as the constant
term in the eigenfunction expansion. We evaluate this term using the orthogonality formula
(11.26), and so, as $t \to \infty$,

$$ u(t,x,y) \;\longrightarrow\; c_0 = \frac{\langle f , 1 \rangle}{\| 1 \|^2} = \frac{\displaystyle \iint_\Omega f(x, y)\, dx\, dy}{\displaystyle \iint_\Omega dx\, dy} = \frac{1}{\operatorname{area} \Omega} \iint_\Omega f(x,y)\, dx\, dy. \tag{11.28} $$

We conclude that the equilibrium temperature is equal to the average initial temperature
distribution. Thus, when the plate is fully insulated, the heat energy cannot escape, and
instead redistributes itself in a uniform manner over the domain.
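
The limiting value (11.28) is easy to observe in a crude simulation. The sketch below is an illustration only, with assumptions of our own: a unit square plate divided into a 10 by 10 grid of cells, constant diffusivity $\gamma = 1$, a forward Euler finite difference step with no flux through the outer edges, and initial temperature 1 on the left half and 0 on the right half. The helper names are hypothetical.

```python
def insulated_heat_step(u, gamma, h, dt):
    """Forward Euler step for u_t = gamma * Laplacian(u) with insulated edges."""
    nx, ny = len(u), len(u[0])
    new = [row[:] for row in u]
    for i in range(nx):
        for j in range(ny):
            lap = 0.0
            for p, q in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                if 0 <= p < nx and 0 <= q < ny:   # a missing neighbour means no flux
                    lap += u[p][q] - u[i][j]
            new[i][j] = u[i][j] + gamma * dt * lap / (h * h)
    return new


def mean(u):
    """Average temperature over the grid, the discrete analogue of (11.28)."""
    return sum(map(sum, u)) / (len(u) * len(u[0]))


n, h, gamma = 10, 0.1, 1.0
u = [[1.0 if i < n // 2 else 0.0 for j in range(n)] for i in range(n)]
c0 = mean(u)                                   # predicted equilibrium temperature
for _ in range(1000):                          # integrate up to time t = 2
    u = insulated_heat_step(u, gamma, h, 0.2 * h * h)
spread = max(map(max, u)) - min(map(min, u))   # deviation from a uniform state
```

The average is preserved by every step, while the spread between the hottest and coldest cells shrinks at a rate governed by the smallest positive eigenvalue of the discrete problem.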

Diffusion has a smoothing effect on the initial temperature distribution $f(x,y)$. Assume
that the eigenfunction coefficients are uniformly bounded, so $| c_k | \le M$ for some
constant $M$. This will certainly be the case if $f(x, y)$ is piecewise continuous or, more
generally, belongs to $L^2$, since Bessel's inequality, (3.117), which holds for general orthogonal
systems, implies that $c_k \to 0$ as $k \to \infty$. Many distributions, including delta functions,
also have bounded Fourier coefficients. Then, at any time $t > 0$ after the initial instant,
the coefficients $c_k\, e^{-\lambda_k t}$ in the eigenfunction series solution (11.23) are exponentially small
as $k \to \infty$, which is enough to ensure smoothness of the solution $u(t, x, y)$ for each $t > 0$.
Therefore, the diffusion process serves to immediately smooth out jumps, corners, and
other discontinuities in the initial data. As time progresses, the local variations in the
solution become less and less pronounced, as it asymptotically reaches a constant equilibrium
state.

As a result, diffusion processes can be effectively applied to smooth and denoise planar
images. The initial data $u(0,x,y) = f(x,y)$ represents the gray-scale value of the image at
position $(x,y)$, so that $0 \le f(x,y) \le 1$, with 0 representing black and 1 representing white.
As time progresses, the solution $u(t,x,y)$ represents a more and more smoothed version
of the image. Although this has the effect of removing unwanted high-frequency noise,
there is also a gradual blurring of the actual features. Thus, the “time” or “multiscale”
parameter $t$ needs to be chosen to optimally balance between the two effects: the larger
$t$ is, the more noise is removed, but the more noticeable the blurring. A representative
illustration appears in Figure 11.1. The blurring affects small-scale features first, then,
gradually, those at larger and larger scales, until eventually the entire image is blurred to
a uniform gray. To further suppress undesirable blurring effects, modern image-processing
filters are based on anisotropic (and thus nonlinear) diffusion equations; see [100] for a
survey of recent progress in this active field.

Figure 11.1. Smoothing a gray scale image.
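
The trade-off between noise removal and blurring can be tried out numerically. The following is a minimal sketch, not the method used to produce Figure 11.1: it assumes unit pixel spacing, explicit Euler time stepping, and insulated (Neumann) image edges implemented with replicated ghost pixels. The function name `heat_smooth` and all parameter choices are hypothetical.

```python
import numpy as np

def heat_smooth(image, t, gamma=1.0):
    """Evolve u_t = gamma * Laplacian(u) up to time t with insulated image edges.

    Assumptions of this sketch: unit pixel spacing, explicit Euler steps, and
    a step size kept below the stability limit 1 / (4 * gamma).
    """
    u = np.asarray(image, dtype=float).copy()
    if t <= 0:
        return u
    n_steps = max(1, int(np.ceil(gamma * t / 0.2)))   # ensures gamma * dt <= 0.2
    dt = t / n_steps
    for _ in range(n_steps):
        p = np.pad(u, 1, mode="edge")                 # ghost pixels: zero normal flux
        lap = p[2:, 1:-1] + p[:-2, 1:-1] + p[1:-1, 2:] + p[1:-1, :-2] - 4.0 * u
        u += gamma * dt * lap
    return u

# Synthetic test image: a white square on a black background, plus noise.
rng = np.random.default_rng(0)
clean = np.zeros((64, 64))
clean[16:48, 16:48] = 1.0
noisy = clean + 0.2 * rng.standard_normal(clean.shape)

for t in (0.0, 1.0, 50.0):
    smoothed = heat_smooth(noisy, t)
    error = np.sqrt(np.mean((smoothed - clean) ** 2))
    print(f"t = {t:5.1f}   rms error vs clean image = {error:.3f}   mean = {smoothed.mean():.4f}")
```

With insulated edges the mean gray level is unchanged by the smoothing, in line with the conservation of heat for the Neumann problem, while the error relative to the clean image first drops (noise is removed) and then grows again (features are blurred).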

Since  the  forward  heat  equation  effectively  blurs  the  features  in  an  image,  we  might  be 
tempted  to  reverse  “time”  in  order  to  sharpen  the  image.  However,  the  argument  presented 
in  Section  4.1  tells  us  that  the  backwards  heat  equation  is  ill-posed,  and  hence  cannot  be 
used  directly  for  this  purpose.  Various  “regularization”  strategies  have  been  devised  to 
circumvent  this  mathematical  barrier,  and  thereby  design  effective  image  enhancement 
algorithms,  [46]. 


Inhomogeneous  Boundary  Conditions  and  Forcing 

Let  us  next  briefly  discuss  how  to  incorporate  inhomogeneous  boundary  conditions  and 
external  heat  sources  into  the  general  solution  framework.  Consider,  as  a  specific  example, 
the  forced  heat  equation 

$$u_t = \gamma\,\Delta u + F(x,y) \quad \text{for} \quad (x,y) \in \Omega, \tag{11.29}$$

where $F(x,y)$ represents an unvarying external heat source or sink, subject to inhomogeneous
Dirichlet boundary conditions

$$u(x,y) = h(x,y) \quad \text{for} \quad (x,y) \in \partial\Omega, \tag{11.30}$$

that fix the temperature of the plate on its boundary. When the external forcing does
not vary in time, we expect the solution to eventually settle down to an equilibrium
configuration: $u(t,x,y) \to u_\star(x,y)$ as $t \to \infty$. This will be justified below.

The time-independent equilibrium temperature $u_\star(x,y)$ satisfies the equation obtained
by setting $u_t = 0$ in the evolution equation (11.29), which reduces it to the Poisson equation

$$-\gamma\,\Delta u_\star = F \quad \text{for} \quad (x,y) \in \Omega. \tag{11.31}$$

The equilibrium solution is subject to the same inhomogeneous Dirichlet boundary conditions
(11.30). Positive definiteness of the Dirichlet boundary value problem implies that
there is a unique equilibrium solution, which can be characterized as the sole minimizer of
the associated Dirichlet principle; for details see Section 9.3.

With the equilibrium solution in hand, we let

$$v(t,x,y) = u(t,x,y) - u_\star(x,y)$$

measure the deviation of the dynamical solution $u$ from its eventual equilibrium. By
linearity, $v(t,x,y)$ satisfies the unforced heat equation subject to homogeneous boundary
conditions:

$$v_t = \gamma\,\Delta v, \quad (x,y) \in \Omega, \qquad v = 0, \quad (x,y) \in \partial\Omega. \tag{11.32}$$

Therefore, $v$ can be expanded in an eigenfunction series (11.23), and will decay to zero,
$v(t,x,y) \to 0$, at an exponentially fast rate prescribed by the smallest eigenvalue $\lambda_1$ of
the associated homogeneous Helmholtz boundary value problem. (Special initial data can
decay at a faster rate, prescribed by a larger eigenvalue.) Consequently, the solution to the
forced inhomogeneous problem (11.29-30) will approach thermal equilibrium,

$$u(t,x,y) = v(t,x,y) + u_\star(x,y) \;\longrightarrow\; u_\star(x,y),$$

at exactly the same exponential rate as its homogeneous counterpart.
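
The decay of the deviation $v = u - u_\star$ can be observed in a small finite-difference experiment. The sketch below rests on several assumptions that are not in the text: the domain is the unit square, the forcing is the constant $F = 1$, the boundary data are chosen so that $u_\star(x,y) = x + x(1-x)/(2\gamma)$ is the exact equilibrium, and time stepping is explicit Euler. For the unit square the smallest Dirichlet eigenvalue is $2\pi^2\gamma$, so the measured rate should be close to that value. The function name is hypothetical.

```python
import numpy as np

def deviation_decay_rate(n=20, gamma=1.0, t1=0.2, t2=0.3):
    """Estimate the exponential decay rate of max|u - u_star| on the unit square."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    X, Y = np.meshgrid(x, x, indexing="ij")
    F = 1.0                                        # constant heat source (assumed)
    u_star = X + X * (1.0 - X) / (2.0 * gamma)     # solves -gamma * Laplacian(u_star) = F
    u = u_star.copy()                              # boundary values are those of u_star
    u[1:-1, 1:-1] = 0.0                            # cold interior at t = 0
    dt = h * h / (8.0 * gamma)                     # inside the explicit stability limit
    steps1, steps2 = round(t1 / dt), round(t2 / dt)
    deviation = {}
    for step in range(1, steps2 + 1):
        lap = (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
               - 4.0 * u[1:-1, 1:-1]) / (h * h)
        u[1:-1, 1:-1] += dt * (gamma * lap + F)    # forced heat equation, interior only
        if step in (steps1, steps2):
            deviation[step] = np.max(np.abs(u - u_star))
    elapsed = (steps2 - steps1) * dt
    return float(np.log(deviation[steps1] / deviation[steps2]) / elapsed)

print("measured decay rate:", deviation_decay_rate())
print("2 * pi^2 * gamma   :", 2.0 * np.pi ** 2)
```

The quadratic equilibrium was picked because the five-point difference Laplacian reproduces it exactly, so any remaining deviation is due to the transient alone and not to discretization error in $u_\star$.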


The  Maximum  Principle 


Finally,  let  us  state  and  prove  the  (Weak)  Maximum  Principle  for  the  two-dimensional 
heat  equation.  As  in  the  one-dimensional  situation  described  in  Section  8.3,  it  states  that 
the  maximum  temperature  in  a  body  that  is  either  insulated  or  having  heat  removed  from 
its  interior  must  occur  either  at  the  initial  time  or  on  its  boundary.  Observe  that  there  are 
no  conditions  imposed  on  the  boundary  temperatures. 

Theorem 11.1. Suppose $u(t,x,y)$ is a solution to the forced heat equation

$$u_t = \gamma\,\Delta u + F(t,x,y), \quad \text{for} \quad (x,y) \in \Omega, \quad 0 < t < c,$$

where $\Omega$ is a bounded domain, and $\gamma > 0$. Suppose $F(t,x,y) \le 0$ for all $(x,y) \in \Omega$ and
$0 \le t \le c$. Then the global maximum of $u$ on the set $\{\, (t,x,y) \mid (x,y) \in \overline{\Omega},\ 0 \le t \le c \,\}$
occurs either when $t = 0$ or at a boundary point $(x,y) \in \partial\Omega$.

Proof: First, let us prove the result under the assumption that $F(t,x,y) < 0$
everywhere. At a local interior maximum, $u_t = 0$, and, since its Hessian matrix

$$\nabla^2 u = \begin{pmatrix} u_{xx} & u_{xy} \\ u_{xy} & u_{yy} \end{pmatrix}$$

must be negative semi-definite, both diagonal entries $u_{xx}, u_{yy} \le 0$
there. This would imply that $u_t - \gamma\,\Delta u \ge 0$, which contradicts $u_t - \gamma\,\Delta u = F < 0$. If the maximum
were to occur when $t = c$, then $u_t \ge 0$ there, and also $u_{xx}, u_{yy} \le 0$, leading again to a
contradiction.

To generalize to the case $F(t,x,y) \le 0$, which includes the heat equation when
$F(t,x,y) \equiv 0$, set

$$v(t,x,y) = u(t,x,y) + \varepsilon\,(x^2 + y^2), \quad \text{where} \quad \varepsilon > 0.$$

Then, since $\Delta v = \Delta u + 4\,\varepsilon$,

$$\frac{\partial v}{\partial t} = \gamma\,\Delta v - 4\,\gamma\,\varepsilon + F(t,x,y) = \gamma\,\Delta v + \widetilde{F}(t,x,y),$$

where

$$\widetilde{F}(t,x,y) = F(t,x,y) - 4\,\gamma\,\varepsilon < 0.$$

Thus, by the previous paragraph, the maximum of $v$ occurs either when $t = 0$ or at a
boundary point $(x,y) \in \partial\Omega$. We then let $\varepsilon \to 0$ and conclude the same for $u$. More
precisely, let $u(t,x,y) \le M$ on $t = 0$ or $(x,y) \in \partial\Omega$. Then, applying the first part of the proof to $v$,

$$v(t,x,y) \le M + C\,\varepsilon, \quad \text{where} \quad C = \max\{\, x^2 + y^2 \mid (x,y) \in \partial\Omega \,\} < \infty,$$

since $\Omega$ is a bounded domain. Thus,

$$u(t,x,y) \le v(t,x,y) \le M + C\,\varepsilon.$$

Letting $\varepsilon \to 0$ proves that $u(t,x,y) \le M$ at all $(x,y) \in \Omega$, $0 \le t \le c$, which completes the
proof. Q.E.D.
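
A discrete analogue of the Maximum Principle is easy to test. The sketch below is an illustration under stated assumptions, not part of the proof: it uses the unit square, an explicit finite-difference scheme with time step $h^2/(8\gamma)$ (for which each new interior value is a weighted average of old neighboring values plus a nonpositive forcing term), random initial and boundary values, boundary values held fixed in time, and the hypothetical sink $F = -5(x+y) \le 0$.

```python
import numpy as np

def max_principle_check(n=32, gamma=1.0, t_final=0.05, seed=0):
    """Return (largest interior value at any later time, max of initial/boundary data)."""
    rng = np.random.default_rng(seed)
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    X, Y = np.meshgrid(x, x, indexing="ij")
    u = rng.uniform(-1.0, 1.0, size=X.shape)    # arbitrary initial and boundary values
    F = -5.0 * (X + Y)                          # heat is removed everywhere: F <= 0
    dt = h * h / (8.0 * gamma)                  # keeps the update a weighted average
    data_max = float(u.max())                   # boundary values stay fixed below
    later_max = -np.inf
    for _ in range(int(t_final / dt)):
        lap = (u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
               - 4.0 * u[1:-1, 1:-1]) / (h * h)
        u[1:-1, 1:-1] += dt * (gamma * lap + F[1:-1, 1:-1])
        later_max = max(later_max, float(u[1:-1, 1:-1].max()))
    return later_max, data_max

later_max, data_max = max_principle_check()
print("largest interior value for t > 0:", later_max)
print("largest initial/boundary value  :", data_max)
```

The first number should never exceed the second (up to rounding), mirroring the statement of Theorem 11.1 for this particular discretization.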

Remark: The preceding proof can be readily adapted to general diffusion equations
(11.12), assuming that the coefficients $\sigma$, $\kappa$ remain strictly positive throughout the domain.


Exercises 


11.1.1.  A  homogeneous,  isotropic  circular  metal  disk  of  radius  1  meter  has  its  entire  boundary 
insulated.  The  initial  temperature  at  a  point  is  equal  to  the  distance  of  the  point  from  the 
center.  Formulate  an  initial-boundary  value  problem  governing  the  disk’s  subsequent 
temperature  dynamics.  What  is  the  eventual  equilibrium  temperature  of  the  disk? 


11.1.2.  A  homogeneous,  isotropic,  circular  metal  disk  of  radius  2  cm  has  half  its  boundary  fixed 
at  100°  and  the  other  half  insulated.  Given  a  prescribed  initial  temperature  distribution, 
set  up  the  initial-boundary  value  problem  governing  its  subsequent  temperature  profile. 
What  is  the  eventual  equilibrium  temperature  of  the  disk?  Does  your  answer  depend  on 
the  initial  temperature? 

11.1.3. Given the initial temperature distribution $f(x,y) = xy(1-x)(1-y)$ on the unit square
$\Omega = \{\, 0 < x, y < 1 \,\}$, determine the equilibrium temperature when subject to homogeneous
(a) Dirichlet boundary conditions; (b) Neumann boundary conditions.

11.1.4.  A  square  plate  with  side  lengths  1  meter  has  its  right  and  left  edges  insulated,  its  top 
edge  held  at  100°,  and  its  bottom  edge  held  at  0°.  Assuming  that  the  plate  is  made  out  of 
a  homogeneous,  isotropic  material,  formulate  an  appropriate  initial-boundary  value 
problem describing the temperature dynamics of the plate. Then find its eventual equilibrium
temperature.

11.1.5.  A  square  plate  with  side  lengths  1  meter  has  initial  temperature  5°  throughout,  and 
evolves subject to the Neumann boundary conditions $\partial u/\partial n = 1$ on its entire boundary.
What  is  the  eventual  equilibrium  temperature? 


G 11.1.6. Let $u(t,x,y)$ be a solution to the heat equation on a bounded domain $\Omega$ subject to
homogeneous Neumann conditions on its boundary $\partial\Omega$. (a) Prove that the total heat
$H(t) = \iint_\Omega u(t,x,y)\,dx\,dy$ is conserved, i.e., is constant in time. (b) Use part (a) to
prove that the eventual equilibrium solution is everywhere equal to the average of the initial
temperature $u(0,x,y)$. (c) What can you say about the behavior of the total heat for the
homogeneous Dirichlet boundary value problem? (d) What about an inhomogeneous
Dirichlet boundary value problem?




11.1.7. Let $u(t,x,y)$ be a nonconstant solution to the heat equation on a connected, bounded
domain $\Omega$ subject to homogeneous Dirichlet boundary conditions on $\partial\Omega$. (a) Prove that its
$L^2$ norm $N(t) = \sqrt{\iint_\Omega u(t,x,y)^2\,dx\,dy}$ is a strictly decreasing function of $t$. (b) Is this
also true for mixed boundary conditions? (c) For Neumann boundary conditions?


11.1.8.  Are  the  conclusions  in  Exercises  11.1.6  and  11.1.7  valid  for  the  general  diffusion 
equation  (11.12)? 

0  11.1.9.  Write  out  the  eigenvalue  equation  governing  the  separable  solutions  to  the  general 

diffusion  equation  (11.11),  subject  to  appropriate  boundary  conditions.  Given  a  complete 
system  of  eigenfunctions,  write  down  the  eigenfunction  series  solution  to  the  initial  value 
problem  u(0,x,y)  =  f(x,y),  including  the  formulas  for  the  coefficients. 


11.1.10.  True  or  false:  The  equilibrium  temperature  of  a  fully  insulated  nonuniform  plate  whose 
thermodynamics  are  governed  by  the  general  diffusion  equation  (11.12)  equals  the  average 
initial  temperature. 


11.1.11. Let $a > 0$, and consider the initial-boundary value problem $u_t = \Delta u - a\,u$, $u(0,x,y) = f(x,y)$
on a bounded domain $\Omega \subset \mathbb{R}^2$, with boundary conditions $\partial u/\partial n = 0$ on $\partial\Omega$.

(a)  Write  the  equation  in  self-adjoint  form  (9.122).  Hint :  Look  at  Exercise  9.3.26. 

(b)  Prove  that  the  problem  has  a  unique  equilibrium  solution. 


11.1.12.  Write  each  of  the  following  linear  evolution  equations  in  the  self-adjoint  form  (9.122) 
by  choosing  suitable  inner  products  and  a  suitable  set  of  homogeneous  boundary  conditions. 
Is  the  operator  you  construct  positive  definite? 

(a) $u_t = u_{xx} + u_{yy} - u$, (b) $u_t = y\,u_{xx} + x\,u_{yy}$, (c) $u_t = \Delta^2 u$.


<0> 11.1.13. Prove that if $f(x,y)$ is continuous and $\iint_R f(x,y)\,dx\,dy = 0$ for all $R \subset \Omega$, then
$f(x,y) \equiv 0$ for $(x,y) \in \Omega$. Hint: Adapt the method in Exercise 6.1.23.


11.2  Explicit  Solutions  of  the  Heat  Equation 

Solving the two-dimensional heat equation in series form requires knowing the eigenfunctions
for the associated Helmholtz boundary value problem. Unfortunately, as with
the  vast  majority  of  partial  differential  equations,  explicit  solution  formulas  are  few  and  far 
between.  In  this  section,  we  discuss  two  specific  cases  in  which  the  required  eigenfunctions 
can  be  found  in  closed  form.  The  calculations  rely  on  a  further  separation  of  variables, 
which,  as  we  know,  works  in  only  a  very  limited  class  of  domains.  Nevertheless,  interesting 
solution  features  can  be  gleaned  from  these  particular  geometries. 

The  first  example  is  a  rectangular  domain,  and  the  eigensolutions  can  be  expressed  in 
terms  of  elementary  functions  —  trigonometric  functions  and  exponentials.  We  then  study 
the  heating  of  a  circular  disk.  In  this  case,  the  eigenfunctions  are  no  longer  elementary 
functions,  but,  rather,  are  expressed  in  terms  of  Bessel  functions.  Understanding  their 
basic  properties  will  require  us  to  take  a  detour  to  develop  the  fundamentals  of  power 
series  solutions  to  ordinary  differential  equations. 

Heating  of  a  Rectangle 

A homogeneous rectangular plate

$$R = \{\, 0 < x < a,\ 0 < y < b \,\}$$

is heated to a prescribed initial temperature,

$$u(0,x,y) = f(x,y), \quad \text{for} \quad (x,y) \in R. \tag{11.33}$$

Then its top and bottom are insulated, while its sides are held at zero temperature. Our
task is to understand the thermodynamic evolution of the plate’s temperature.

The temperature $u(t,x,y)$ evolves according to the two-dimensional heat equation

$$u_t = \gamma\,(u_{xx} + u_{yy}), \quad \text{for} \quad (x,y) \in R, \quad t > 0, \tag{11.34}$$

where $\gamma > 0$ is the plate’s thermal diffusivity, while subject to homogeneous Dirichlet
conditions along the boundary of the rectangle at all subsequent times:

$$u(t,0,y) = u(t,a,y) = u(t,x,0) = u(t,x,b) = 0, \quad 0 < x < a, \quad 0 < y < b, \quad t > 0. \tag{11.35}$$
As in (11.19), the eigensolutions to the heat equation are obtained from the usual exponential
ansatz $u(t,x,y) = e^{-\lambda t}\,v(x,y)$. Substituting this expression into the heat equation,
we conclude that the function $v(x,y)$ solves the Helmholtz eigenvalue problem

$$\gamma\,(v_{xx} + v_{yy}) + \lambda\,v = 0, \qquad (x,y) \in R, \tag{11.36}$$

subject to the same homogeneous Dirichlet boundary conditions:

$$v(0,y) = v(a,y) = v(x,0) = v(x,b) = 0, \qquad 0 < x < a, \quad 0 < y < b. \tag{11.37}$$

To tackle the rectangular Helmholtz eigenvalue problem (11.36-37), we shall, as in
(4.89), introduce a further separation of variables, writing the solution

$$v(x,y) = p(x)\,q(y)$$

as the product of functions depending on the individual Cartesian coordinates. Substituting
this expression into the Helmholtz equation (11.36), we find

$$\gamma\,p''(x)\,q(y) + \gamma\,p(x)\,q''(y) + \lambda\,p(x)\,q(y) = 0.$$

To effect the variable separation, we collect all terms involving $x$ on one side and all terms
involving $y$ on the other side of the equation, which is accomplished by dividing by $v = p\,q$
and rearranging the terms:


$$\gamma\,\frac{p''(x)}{p(x)} = -\,\gamma\,\frac{q''(y)}{q(y)} - \lambda = -\,\mu.$$

The left-hand side of this equation depends only on $x$, whereas the middle term depends
only on $y$. As before, this requires that the expressions equal a common separation constant,
denoted by $-\mu$. (The minus sign is for later convenience.) In this manner, we reduce our
partial differential equation to a pair of one-dimensional eigenvalue problems


$$\gamma\,\frac{d^2 p}{dx^2} + \mu\,p = 0, \qquad \gamma\,\frac{d^2 q}{dy^2} + (\lambda - \mu)\,q = 0, \tag{11.38}$$

each of which is subject to homogeneous Dirichlet boundary conditions

$$p(0) = p(a) = 0, \qquad q(0) = q(b) = 0, \tag{11.39}$$

stemming from the boundary conditions (11.37). To obtain a nontrivial separable solution
to the Helmholtz equation, we seek nonzero solutions to these two supplementary eigenvalue
problems.

We  have  already  solved  these  particular  two  boundary  value  problems  (11.38-39)  many 
times;  see,  for  instance,  (4.21).  The  eigenfunctions  are,  respectively, 


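$$p_m(x) = \sin\frac{m\pi x}{a}, \quad m = 1, 2, 3, \ldots, \qquad q_n(y) = \sin\frac{n\pi y}{b}, \quad n = 1, 2, 3, \ldots,$$

with

$$\mu = \frac{m^2\pi^2\gamma}{a^2}, \qquad \lambda - \mu = \frac{n^2\pi^2\gamma}{b^2}, \qquad \text{so that} \qquad \lambda = \frac{m^2\pi^2\gamma}{a^2} + \frac{n^2\pi^2\gamma}{b^2}.$$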

Therefore, the separable eigenfunction solutions to the Helmholtz boundary value problem
(11.36-37) have the doubly trigonometric form

$$v_{m,n}(x,y) = \sin\frac{m\pi x}{a}\,\sin\frac{n\pi y}{b}, \quad \text{for} \quad m, n = 1, 2, 3, \ldots, \tag{11.40}$$

with associated eigenvalues

$$\lambda_{m,n} = \frac{m^2\pi^2\gamma}{a^2} + \frac{n^2\pi^2\gamma}{b^2} = \left(\frac{m^2}{a^2} + \frac{n^2}{b^2}\right)\pi^2\gamma. \tag{11.41}$$

Each of these corresponds to an exponentially decaying eigensolution

$$u_{m,n}(t,x,y) = \exp\left[-\left(\frac{m^2}{a^2} + \frac{n^2}{b^2}\right)\pi^2\gamma\,t\right]\sin\frac{m\pi x}{a}\,\sin\frac{n\pi y}{b} \tag{11.42}$$

to the original rectangular Dirichlet boundary value problem for the heat equation.
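
The eigenpairs (11.40-41) can be spot-checked numerically. The sketch below evaluates the residual $\gamma\,(v_{xx} + v_{yy}) + \lambda\,v$ of the Helmholtz equation (11.36) using central differences; the helper names and the sample values of $a$, $b$, $\gamma$ are hypothetical choices for illustration.

```python
import math

def eigenvalue(m, n, a, b, gamma):
    """lambda_{m,n} as given by (11.41)."""
    return (m * m / (a * a) + n * n / (b * b)) * math.pi ** 2 * gamma

def eigenfunction(m, n, a, b, x, y):
    """v_{m,n}(x, y) as given by (11.40)."""
    return math.sin(m * math.pi * x / a) * math.sin(n * math.pi * y / b)

def helmholtz_residual(m, n, a, b, gamma, x, y, h=1e-3):
    """Central-difference estimate of gamma * (v_xx + v_yy) + lambda * v at (x, y)."""
    v = lambda s, t: eigenfunction(m, n, a, b, s, t)
    v_xx = (v(x + h, y) - 2.0 * v(x, y) + v(x - h, y)) / (h * h)
    v_yy = (v(x, y + h) - 2.0 * v(x, y) + v(x, y - h)) / (h * h)
    return gamma * (v_xx + v_yy) + eigenvalue(m, n, a, b, gamma) * v(x, y)

a, b, gamma = 3.0, 2.0, 0.5          # sample rectangle and diffusivity (assumed values)
for m, n in [(1, 1), (2, 1), (3, 2)]:
    res = helmholtz_residual(m, n, a, b, gamma, x=0.37, y=0.81)
    print(f"m={m}, n={n}: lambda = {eigenvalue(m, n, a, b, gamma):.4f}, residual = {res:.2e}")
```

The residuals should be small, limited only by the finite-difference truncation error, and each $v_{m,n}$ vanishes on all four edges of the rectangle, as required by (11.37).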

Using the fact that the univariate sine functions form a complete system, it is not
hard to prove, [120], that the separable eigenfunction solutions (11.42) are complete, and
so there are no non-separable eigenfunctions.† As a consequence, the general solution to
the initial-boundary value problem can be expressed as a linear combination

$$u(t,x,y) = \sum_{m,n=1}^{\infty} c_{m,n}\,u_{m,n}(t,x,y) = \sum_{m,n=1}^{\infty} c_{m,n}\,e^{-\lambda_{m,n} t}\,v_{m,n}(x,y) \tag{11.43}$$

of the eigenmodes. The coefficients $c_{m,n}$ are prescribed by the initial conditions, which
take the form of a double Fourier sine series

$$f(x,y) = u(0,x,y) = \sum_{m,n=1}^{\infty} c_{m,n}\,v_{m,n}(x,y) = \sum_{m,n=1}^{\infty} c_{m,n}\,\sin\frac{m\pi x}{a}\,\sin\frac{n\pi y}{b}.$$

Self-adjointness of the Laplacian operator coupled with the boundary conditions implies
that‡ the eigenfunctions $v_{m,n}(x,y)$ are orthogonal with respect to the $L^2$ inner product
on the rectangle:

$$\int_0^b\!\!\int_0^a v_{k,l}(x,y)\,v_{m,n}(x,y)\,dx\,dy = 0 \quad \text{unless} \quad k = m \ \text{and} \ l = n.$$

† This appears to be a general fact, true in all known examples, but I know of no general
proof. Theorem 9.47 can be used to establish completeness of the eigenfunctions, but does not
guarantee that they can all be constructed by separation of variables.

‡ Technically, orthogonality is guaranteed only when the eigenvalues are distinct: $\lambda_{m,n} \ne \lambda_{k,l}$.
However, by a direct computation, one finds that orthogonality continues to hold even when the
indicated eigenfunctions are associated with equal eigenvalues. See the final subsection of this
chapter for a discussion of when such “accidental degeneracies” arise.

(The skeptical reader can verify the orthogonality relations directly from the eigenfunction
formulas (11.40).) Thus, we can appeal to our usual orthogonality formula (11.26) to
evaluate the coefficients

$$c_{m,n} = \frac{\langle f, v_{m,n}\rangle}{\|v_{m,n}\|^2} = \frac{4}{a\,b}\int_0^b\!\!\int_0^a f(x,y)\,\sin\frac{m\pi x}{a}\,\sin\frac{n\pi y}{b}\,dx\,dy, \tag{11.44}$$

where the formula for the norms of the eigenfunctions

$$\|v_{m,n}\|^2 = \int_0^b\!\!\int_0^a \sin^2\frac{m\pi x}{a}\,\sin^2\frac{n\pi y}{b}\,dx\,dy = \tfrac14\,a\,b \tag{11.45}$$

follows from a direct evaluation of the double integral. Unfortunately, while orthogonality
is (mostly) automatic, computation of the norms must inevitably be done “by hand”.
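
Formulas (11.43-44) translate directly into a short numerical routine. The following is a sketch under stated assumptions: the double integral in (11.44) is approximated by the midpoint rule on a uniform mesh, the series (11.43) is truncated after a fixed number of modes in each direction, and the function names, mesh size, and sample data are all hypothetical.

```python
import numpy as np

def sine_coefficients(f, a, b, modes=8, grid=200):
    """Approximate c_{m,n} in (11.44) by the midpoint rule; entry [m-1, n-1] is c_{m,n}."""
    x = (np.arange(grid) + 0.5) * a / grid
    y = (np.arange(grid) + 0.5) * b / grid
    X, Y = np.meshgrid(x, y, indexing="ij")
    k = np.arange(1, modes + 1)
    Sx = np.sin(np.outer(k, x) * np.pi / a)      # Sx[m-1, i] = sin(m pi x_i / a)
    Sy = np.sin(np.outer(k, y) * np.pi / b)      # Sy[n-1, j] = sin(n pi y_j / b)
    cell_area = (a / grid) * (b / grid)
    return 4.0 / (a * b) * cell_area * (Sx @ f(X, Y) @ Sy.T)

def series_solution(c, a, b, gamma, t, x, y):
    """Evaluate the truncated series (11.43) at time t and the point (x, y)."""
    k = np.arange(1, c.shape[0] + 1)
    lam = (k[:, None] ** 2 / a ** 2 + k[None, :] ** 2 / b ** 2) * np.pi ** 2 * gamma
    sx = np.sin(k * np.pi * x / a)
    sy = np.sin(k * np.pi * y / b)
    return float(sx @ (c * np.exp(-lam * t)) @ sy)

# Sample data (assumed): a 3 x 2 rectangle with f(x, y) = x (3 - x) y (2 - y).
a, b, gamma = 3.0, 2.0, 1.0
c = sine_coefficients(lambda X, Y: X * (a - X) * Y * (b - Y), a, b)
for t in (0.0, 0.1, 0.5):
    print(f"t = {t:.1f}:  u(t, 1.5, 1.0) = {series_solution(c, a, b, gamma, t, 1.5, 1.0):.5f}")
```

For smooth initial data the coefficients fall off quickly, so a modest number of modes already gives a good picture of the solution once $t > 0$.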

For generic initial temperature distributions, the rectangle approaches thermal equilibrium
at a rate equal to the smallest eigenvalue:

$$\lambda_{1,1} = \left(\frac{1}{a^2} + \frac{1}{b^2}\right)\pi^2\gamma,$$

i.e., the sum of the reciprocals of the squared lengths of its sides multiplied by $\pi^2$ and by the diffusion
coefficient. The larger the rectangle, or the smaller the diffusion coefficient, the smaller the
value of $\lambda_{1,1}$ and hence the slower the return to thermal equilibrium. The exponentially
fast decay rate of the Fourier series implies that the solution immediately smooths out any
discontinuities in the initial temperature profile. Indeed, the higher modes, with $m$ and
$n$ large, decay to zero almost instantaneously, and so the solution quickly behaves like a
finite sum over a few low-order modes. Assuming that $c_{1,1} \ne 0$, the slowest-decaying mode
in the Fourier series (11.43) is

$$c_{1,1}\,u_{1,1}(t,x,y) = c_{1,1}\,\exp\left[-\left(\frac{1}{a^2} + \frac{1}{b^2}\right)\pi^2\gamma\,t\right]\sin\frac{\pi x}{a}\,\sin\frac{\pi y}{b}. \tag{11.47}$$

Thus, in the long run, the temperature becomes entirely of one sign (either positive
or negative, depending on the sign of $c_{1,1}$) throughout the rectangle. This observation
is, in fact, indicative of the general phenomenon that an eigenfunction associated with
the smallest positive eigenvalue of a self-adjoint elliptic operator is necessarily of one sign
throughout the domain, [34]. A typical solution is plotted at several times in Figure 11.2.
Non-generic initial conditions, with $c_{1,1} = 0$, decay more rapidly, and their asymptotic
temperature profiles are not of one sign.


Exercises 


11.2.1. A rectangle of size 2 cm by 1 cm has initial temperature $f(x,y) = \sin \pi x\,\sin \pi y$ for
$0 < x < 2$, $0 < y < 1$. All four sides of the rectangle are held at 0°. Assuming that the
thermal diffusivity of the plate is $\gamma = 1$, write down a formula for its subsequent temperature
$u(t,x,y)$. What is the rate of decay to thermal equilibrium?


11.2.2.  Solve  Exercise  11.2.1  when  the  initial  temperature  f(x,y)  is 

(a)  (b)  {  i  1  <  £  <  2;  (C)  4“ 


1 


X 


11.2.3. Solve the initial-boundary value problem for the heat equation $u_t = 2\,\Delta u$ on the rectangle
$-1 < x < 1$, $0 < y < 1$ when the two short sides are kept at 0°, the two long sides are
insulated, and the initial temperature distribution is
$u(0,x,y) = \begin{cases} -1, & x < 0, \\ +1, & x > 0, \end{cases}$ for $0 < y < 1$.


11.2.4.  Answer  Exercise  11.2.3  when  the  two  long  sides  are  kept  at  0  and  the  two  short  sides 
are  insulated. 


T 11.2.5. A rectangular plate of size 1 meter by 3 meters is made out of a metal with unit diffusivity.
The plate is taken from a 0° freezer, and, from then on, one of its long sides is heated
to  100°,  the  other  is  held  at  0°,  while  its  top,  bottom,  and  both  of  the  short  sides  are  fully 
insulated,  (a)  Set  up  the  initial-boundary  value  problem  governing  the  time-dependent 
temperature  of  the  plate,  (b)  What  is  the  equilibrium  temperature?  (c)  Use  your  answer 
from  part  (b)  to  construct  an  eigenfunction  series  for  the  solution,  (d)  How  long  until  the 
temperature  of  the  plate  is  everywhere  within  1°  of  its  eventual  equilibrium? 

Hint:  Once  t  is  no  longer  small,  you  can  approximate  the  series  solution  by  its  first  term. 

11.2.6. Among all rectangular plates of a prescribed area, which one returns to thermal equilibrium
the slowest when subject to Dirichlet boundary conditions? The fastest? Use your
physical  intuition  to  explain  your  answer,  but  justify  it  mathematically. 

11.2.7.  Answer  Exercise  11.2.6  for  a  fully  insulated  rectangular  plate,  i.e.,  subject  to  Neumann 
boundary  conditions. 

O  11.2.8.  A  square  metal  plate  is  taken  from  an  oven,  and  then  set  out  to  cool,  with  its  top  and 
bottom  insulated.  Find  the  rate  of  cooling,  in  terms  of  the  side  length  and  the  thermal 
diffusivity,  if  (a)  all  four  sides  are  held  at  0°;  (b)  one  side  is  insulated  and  the  other  three 
sides  are  held  at  0°;  (c)  two  adjacent  sides  are  insulated  and  the  other  two  are  held  at  0°; 
(d)  two  opposite  sides  are  insulated  and  the  other  two  are  held  at  0°;  (e)  three  sides  are 
insulated  and  the  remaining  side  is  held  at  0°.  Order  the  cooling  rates  of  the  plates  from 
fastest to slowest. Do your results confirm your intuition?

T 11.2.9. Two square plates are made out of the same homogeneous material, and both are initially
heated to 100°. All four sides of the first plate are held at 0°, whereas one of the
sides  of  the  second  plate  is  insulated  while  the  other  three  sides  are  held  at  0°.  Which  plate 
cools  down  the  fastest?  How  much  faster?  Assuming  the  thermal  diffusivity  7  =  1,  how 
long  do  you  have  to  wait  until  every  point  on  each  plate  is  within  1°  of  its  equilibrium 
temperature?  Hint :  Once  t  is  no  longer  small,  the  series  solution  is  well  approximated  by 
its  first  term. 

C  11.2.10.  Multiple  choice :  On  a  unit  square  that  is  subject  to  Dirichlet  boundary  conditions,  the 
eigenvalues  of  the  Laplace  operator  are 

(a)  all  simple,  (b)  at  most  double,  or  (c)  can  have  arbitrarily  large  multiplicity. 

O 11.2.11. The thermodynamics of a thin circular cylindrical shell of radius $a$ and height $h$, e.g.,
the side of a tin can after its top and bottom are removed, is modeled by the heat equation

$$\frac{\partial u}{\partial t} = \gamma\left(\frac{1}{a^2}\,\frac{\partial^2 u}{\partial\theta^2} + \frac{\partial^2 u}{\partial z^2}\right),$$

in which $u(t,\theta,z)$ measures the temperature of the point on
the cylinder at time $t > 0$, angle $-\pi < \theta \le \pi$, and height $0 < z < h$. Keep in mind
that $u(t,\theta,z)$ must be a $2\pi$-periodic function of the angular coordinate $\theta$. Assume that the
cylinder is everywhere insulated, while its two circular ends are held at 0°. Given an initial
temperature distribution at time $t = 0$, write down a series formula for the cylinder’s
temperature at subsequent times. What is the eventual equilibrium temperature?
How fast does the cylinder return to equilibrium?

T 11.2.12. Consider the initial-boundary value problem

$$u_t = u_{xx} + u_{yy}, \qquad u(0,x,y) = 0, \qquad 0 < x, y < \pi, \quad t > 0,$$

for the heat equation in a square subject to the Dirichlet conditions

$$u(0,y) = u(\pi,y) = 0 = u(x,0), \qquad u(x,\pi) = f(x), \qquad 0 < x, y < \pi.$$

Write out eigenfunction series formulas for
(a) the equilibrium solution $u_\star(x,y) = \lim_{t \to \infty} u(t,x,y)$, (b) the solution $u(t,x,y)$.

11.2.13. Solve Exercise 11.2.1 when one long side of the plate is held at 100°.
Hint: See Exercise 11.2.12.


Heating  of  a  Disk  —  Preliminaries 


Let us perform a similar analysis of the thermodynamics of a circular disk. For simplicity
(or by choice of suitable physical units), we will assume that the disk

$$D = \{\, x^2 + y^2 \le 1 \,\} \subset \mathbb{R}^2$$

has unit radius and unit diffusivity $\gamma = 1$. We shall solve the heat equation on $D$ subject
to homogeneous Dirichlet boundary values of zero temperature at the circular edge

$$\partial D = C = \{\, x^2 + y^2 = 1 \,\}.$$

Thus, the full initial-boundary value problem is

$$\begin{aligned} \frac{\partial u}{\partial t} &= \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}, & x^2 + y^2 &< 1, \quad t > 0, \\ u(t,x,y) &= 0, & x^2 + y^2 &= 1, \\ u(0,x,y) &= f(x,y), & x^2 + y^2 &\le 1. \end{aligned} \tag{11.48}$$

We remark that a simple rescaling of space and time, as outlined in Exercise 11.4.7, can
be used to recover the solution for an arbitrary diffusion coefficient and a disk of arbitrary
radius from this particular case.

Since we are working in a circular domain, we instinctively pass to polar coordinates
$(r,\theta)$. In view of the polar coordinate formula (4.105) for the Laplace operator, the heat
equation and boundary and initial conditions assume the form

$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\,\frac{\partial u}{\partial r} + \frac{1}{r^2}\,\frac{\partial^2 u}{\partial\theta^2}, \qquad u(t,1,\theta) = 0, \qquad u(0,r,\theta) = f(r,\theta), \tag{11.49}$$

where the solution $u(t,r,\theta)$ is defined for all $0 \le r \le 1$ and $t \ge 0$. To ensure that the
solution represents a single-valued function on the entire disk, it is required to be a
$2\pi$-periodic function of the angular variable:

$$u(t,r,\theta + 2\pi) = u(t,r,\theta).$$

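To obtain the separable solutions

$$u(t,r,\theta) = e^{-\lambda t}\,v(r,\theta), \tag{11.50}$$

we need to solve the polar coordinate form of the Helmholtz equation

$$\frac{\partial^2 v}{\partial r^2} + \frac{1}{r}\,\frac{\partial v}{\partial r} + \frac{1}{r^2}\,\frac{\partial^2 v}{\partial\theta^2} + \lambda\,v = 0, \qquad 0 \le r < 1, \quad -\pi < \theta \le \pi, \tag{11.51}$$

subject to the boundary conditions

$$v(1,\theta) = 0, \qquad v(r,\theta + 2\pi) = v(r,\theta). \tag{11.52}$$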

To solve the polar Helmholtz boundary value problem (11.51-52), we invoke a further
separation of variables by writing

$$v(r,\theta) = p(r)\,q(\theta). \tag{11.53}$$

Substituting this ansatz into (11.51), collecting all terms involving $r$ and all terms involving
$\theta$, and then equating both to a common separation constant, we are led to the pair of
ordinary differential equations

$$r^2\,\frac{d^2 p}{dr^2} + r\,\frac{dp}{dr} + (\lambda\,r^2 - \mu)\,p = 0, \qquad \frac{d^2 q}{d\theta^2} + \mu\,q = 0, \tag{11.54}$$

where $\lambda$ is the Helmholtz eigenvalue, and $\mu$ the separation constant.

Let us start with the equation for $q(\theta)$. The second boundary condition in (11.52)
requires that $q(\theta)$ be $2\pi$-periodic. Therefore, the required solutions are the elementary
trigonometric functions

$$q(\theta) = \cos m\theta \quad \text{or} \quad \sin m\theta, \quad \text{where} \quad \mu = m^2, \tag{11.55}$$

with $m = 0, 1, 2, \ldots$ a nonnegative integer.

Substituting the formula for the separation constant, $\mu = m^2$, the differential equation
for $p(r)$ takes the form

$$r^2\,\frac{d^2 p}{dr^2} + r\,\frac{dp}{dr} + (\lambda\,r^2 - m^2)\,p = 0, \qquad 0 \le r \le 1. \tag{11.56}$$

Ordinarily, one imposes two boundary conditions in order to pin down a solution to such a
second-order ordinary differential equation. But our Dirichlet condition, namely $p(1) = 0$,
specifies its value at only one of the endpoints. The other endpoint is a singular point for
the ordinary differential equation, because the coefficient of the highest-order derivative,
namely $r^2$, vanishes at $r = 0$. This situation might remind you of our solution to the Euler
differential equation (4.111) in the context of separable solutions to the Laplace equation
on the disk. As there, we require that the solution be bounded at $r = 0$, and so seek
eigensolutions that satisfy the boundary conditions

$$|\,p(0)\,| < \infty, \qquad p(1) = 0. \tag{11.57}$$

While  (11.56)  appears  in  a  variety  of  applications,  it  is  more  challenging  than  any 
ordinary  differential  equation  we  have  encountered  so  far.  Indeed,  most  solutions  cannot 
be  written  in  terms  of  the  elementary  functions  (rational  functions,  trigonometric  functions, 
exponentials,  logarithms,  etc.)  you  see  in  first-year  calculus.  Nevertheless,  owing  to  their 
ubiquity  in  physical  applications,  its  solutions  have  been  extensively  studied  and  tabulated, 
and  so  are,  in  a  sense,  well  known,  [85,  86,  119]. 

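To simplify the subsequent analysis, we make a preliminary rescaling of the independent
variable, replacing $r$ by

$$z = \sqrt{\lambda}\; r.$$

(We know the eigenvalue $\lambda > 0$, since we are dealing with a positive definite boundary
value problem.) Note that, by the chain rule,

$$\frac{dp}{dr} = \sqrt{\lambda}\,\frac{dp}{dz}, \qquad \frac{d^2 p}{dr^2} = \lambda\,\frac{d^2 p}{dz^2},$$

and hence

$$r\,\frac{dp}{dr} = z\,\frac{dp}{dz}, \qquad r^2\,\frac{d^2 p}{dr^2} = z^2\,\frac{d^2 p}{dz^2}.$$

The net effect is to eliminate the eigenvalue parameter $\lambda$ (or, rather, hide it in the change
of variables), so that (11.56) assumes the slightly simpler form

$$z^2\,\frac{d^2 p}{dz^2} + z\,\frac{dp}{dz} + (z^2 - m^2)\, p = 0. \tag{11.58}$$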

The resulting ordinary differential equation (11.58) is known as Bessel's equation, named
after the early-nineteenth-century German astronomer Wilhelm Bessel, who first encountered
its solutions, now known as Bessel functions, in his study of planetary orbits. Special
cases had already appeared in the investigations of Daniel Bernoulli on vibrations of a hanging
chain, and in those of Fourier on the thermodynamics of a cylindrical body. To make
further progress, we need to take time out to study their basic properties, and this will require
us to develop the method of power series solutions of ordinary differential equations.
With  this  in  hand,  we  can  then  return  to  complete  our  solution  to  the  heat  equation  on  a 
disk. 


11.3  Series  Solutions  of  Ordinary  Differential  Equations 

When confronted with a novel ordinary differential equation, we have several available
options for deriving and understanding its solutions. For instance, the “look-up” method
relies on published handbooks. One of the most useful references that collects many solved
differential equations is the classic German compendium by Kamke, [62]. Two more recent
English-language handbooks are [93, 127]. In addition, many symbolic computer algebra
programs, including Mathematica and Maple, will produce solutions, when expressible
in terms of both elementary and special functions, to a wide range of differential equations.

Of course, use of numerical integration to approximate solutions, [24, 60, 80], is always
an option. Numerical methods do, however, have their limitations, and are best
accompanied  by  some  understanding  of  the  underlying  theory,  coupled  with  qualitative 
or  quantitative  expectations  of  how  the  solutions  should  behave.  Furthermore,  numerical 
methods  provide  less  than  adequate  insight  into  the  nature  of  the  special  functions  that 
regularly  appear  as  solutions  of  the  particular  differential  equations  arising  in  separation 
of  variables.  A  numerical  approximation  cannot,  in  itself,  establish  rigorous  mathematical 
properties  of  the  solutions  of  the  differential  equation. 


A more classical means of constructing and approximating the solutions of differential
equations is based on their power series expansions, a.k.a. Taylor series. The Taylor
expansion of a solution at a point $x_0$ is found by substituting a general power series into the
differential equation and equating coefficients of the various powers of $x - x_0$. The initial
conditions at $x_0$ serve to uniquely determine the coefficients and hence all the derivatives of
the solution at the initial point. The Taylor expansion of a special function is an effective
tool for deducing some of its key properties, as well as providing a means of computing
reasonable numerical approximations to its values within the radius of convergence of
the series. (However, serious numerical computations more often rely on nonconvergent
asymptotic expansions, [85].)

In  this  section,  we  provide  a  brief  introduction  to  the  basic  series  solution  techniques  for 
ordinary  differential  equations,  concentrating  on  second-order  linear  differential  equations, 
since  these  form  by  far  the  most  important  class  of  examples  arising  in  applications.  At  a 
regular  point,  the  method  will  produce  a  standard  Taylor  expansion  for  the  solution,  while 
so-called  regular  singular  points  require  a  slightly  more  general  type  of  series  expansion. 
Generalizations  to  irregular  singular  points,  higher-order  equations,  nonlinear  equations, 
and even linear and nonlinear systems are deferred to more advanced texts, including
[54, 59].


The  Gamma  Function 

Before delving into the machinery of series solutions and special functions, we need to
introduce the gamma function, which effectively generalizes the factorial operation to
nonintegers. Recall that the factorial of a nonnegative integer $n \ge 0$ is defined inductively by
the iterative formula

$$n! = n \cdot (n - 1)!, \qquad \text{starting with} \quad 0! = 1. \tag{11.59}$$

When  n  is  a  positive  integer,  the  iteration  terminates,  yielding  the  familiar  expression 

$$n! = n\,(n - 1)\,(n - 2) \cdots 3 \cdot 2 \cdot 1. \tag{11.60}$$

However, for more general values of $n$, the iteration never stops, and it cannot be used to
compute its factorial. Our goal is to circumvent this difficulty, and introduce a function
$f(x)$ that is defined for all values of $x$ and will play the role of such a factorial. First,
mimicking (11.59), the function should satisfy the functional equation

$$f(x) = x\, f(x - 1) \tag{11.61}$$

where defined. If, in addition, $f(0) = 1$, then we know that $f(n) = n!$ whenever $n$ is a
nonnegative integer, and hence such a function will extend the definition of the factorial
to more general real and complex numbers.

A  moment’s  thought  should  convince  the  reader  that  there  are  many  possible  ways 
to  construct  such  a  function;  see  Exercise  11.3.6  for  a  nonstandard  example.  The  most 
important  version  is  due  to  Euler.  The  modern  definition  of  Euler’s  gamma  function  relies 
on  an  integral  formula  discovered  by  the  eighteenth-century  French  mathematician  Adrien- 
Marie  Legendre,  who  will  play  a  starring  role  in  Chapter  12. 


Definition 11.2. The gamma function is defined by

$$\Gamma(x) = \int_0^\infty e^{-t}\, t^{x-1}\, dt. \tag{11.62}$$


The first fact is that, for real $x$, the gamma function integral converges only when
$x > 0$; otherwise the singularity of $t^{x-1}$ at $t = 0$ is too severe. The key property that turns
the gamma function into a substitute for the factorial function relies on an elementary
integration by parts:

$$\Gamma(x + 1) = \int_0^\infty e^{-t}\, t^{x}\, dt = -\, e^{-t}\, t^{x}\, \Big|_{t=0}^{\infty} + x \int_0^\infty e^{-t}\, t^{x-1}\, dt.$$

The boundary terms vanish whenever $x > 0$, while the final integral is merely $\Gamma(x)$.
Therefore, the gamma function satisfies the recurrence relation

$$\Gamma(x + 1) = x\, \Gamma(x). \tag{11.63}$$


If we set $f(x) = \Gamma(x + 1)$, then (11.63) becomes (11.61). Moreover, by direct integration,

$$\Gamma(1) = \int_0^\infty e^{-t}\, dt = 1.$$

Combining this with the recurrence relation (11.63), we deduce that

$$\Gamma(n + 1) = n! \tag{11.64}$$

whenever $n \ge 0$ is a nonnegative integer. Therefore, we can identify $x!$ with the value
$\Gamma(x + 1)$ whenever $x > -1$ is any real number.
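The identification (11.64) can be checked numerically. The following is a minimal sketch, not part of the original text: the helper name `gamma_integral`, the truncation point `upper`, and the midpoint rule are all illustrative assumptions. It approximates the integral (11.62) directly and compares the result with the factorial and with Python's built-in `math.gamma`.

```python
import math

def gamma_integral(x, upper=60.0, steps=60_000):
    """Midpoint-rule approximation of the gamma integral (11.62), truncated at t = upper.

    Hypothetical helper, intended for x >= 1, where the integrand is bounded at t = 0.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h          # midpoint of the i-th subinterval
        total += math.exp(-t) * t ** (x - 1)
    return total * h

if __name__ == "__main__":
    # Compare n! with the integral for Gamma(n + 1) and with the library value.
    for n in range(6):
        print(n, math.factorial(n), gamma_integral(n + 1), math.gamma(n + 1))
    # Recurrence relation (11.63) at a non-integer point.
    x = 2.7
    print(math.gamma(x + 1), x * math.gamma(x))
```

The crude quadrature is used only to make the definition concrete; for actual computations the library routine is the better choice.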

Remark: The reader may legitimately ask why not replace $t^{x-1}$ by $t^{x}$ in the definition
of $\Gamma(x)$, which would avoid the $n + 1$ in (11.64). There is no good answer; we are merely
following  a  well-established  precedent  set  by  Legendre  and  enshrined  in  all  subsequent 
works. 

Thus, at integer values of $x$, the gamma function agrees with the elementary factorial.
A few other values can be computed exactly. One important case is at $x = \frac12$. Using the
substitution $t = s^2$, with $dt = 2s\, ds$, we obtain

$$\Gamma\bigl(\tfrac12\bigr) = \int_0^\infty e^{-t}\, t^{-1/2}\, dt = \int_0^\infty 2\, e^{-s^2}\, ds = \sqrt{\pi}, \tag{11.65}$$

where the final integral was evaluated in (2.100). Thus, using the identification with the
factorial function, we identify this value with $\bigl(-\tfrac12\bigr)! = \sqrt{\pi}$. The recurrence relation
(11.63) will then produce the value of the gamma function at all half-integers $\frac12, \frac32, \frac52, \ldots$.
For example,

$$\Gamma\bigl(\tfrac32\bigr) = \tfrac12\, \Gamma\bigl(\tfrac12\bigr) = \tfrac12 \sqrt{\pi}, \tag{11.66}$$

and hence $\tfrac12 ! = \tfrac12 \sqrt{\pi}$. The recurrence relation can also be employed to extend the definition
of $\Gamma(x)$ to (most) negative values of $x$. For example, setting $x = -\tfrac12$ in (11.63), we have

$$\Gamma\bigl(\tfrac12\bigr) = -\tfrac12\, \Gamma\bigl(-\tfrac12\bigr), \qquad \text{so} \qquad \Gamma\bigl(-\tfrac12\bigr) = -2\, \Gamma\bigl(\tfrac12\bigr) = -2 \sqrt{\pi}.$$

The only points at which this device fails are the nonpositive integers, and indeed, $\Gamma(x)$ has
a singularity when $x = 0, -1, -2, -3, \ldots$. A graph† of the gamma function is displayed in
Figure 11.3.

Figure 11.3. The gamma function.
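The extension of the gamma function to negative arguments by means of the recurrence relation (11.63) can be sketched in a few lines. This is an illustrative sketch only: the function name `gamma_via_recurrence` is hypothetical, and Python's `math.gamma` (which already accepts negative non-integer arguments) is used for the values at positive arguments and as an independent check.

```python
import math

def gamma_via_recurrence(x):
    """Extend Gamma to negative non-integer x using Gamma(x) = Gamma(x + 1) / x.

    Hypothetical helper: repeatedly applies (11.63) until the argument is positive,
    then calls math.gamma there. Fails exactly where the device in the text fails.
    """
    if x <= 0 and x == int(x):
        raise ValueError("Gamma is singular at 0, -1, -2, ...")
    factor = 1.0
    while x <= 0:
        factor *= x      # accumulate x (x + 1) (x + 2) ... along the way
        x += 1.0
    return math.gamma(x) / factor

if __name__ == "__main__":
    root_pi = math.sqrt(math.pi)
    print(gamma_via_recurrence(0.5), root_pi)          # (11.65)
    print(gamma_via_recurrence(1.5), 0.5 * root_pi)    # (11.66)
    print(gamma_via_recurrence(-0.5), -2.0 * root_pi)  # value derived in the text
```

Accumulating the product of the shifted arguments, rather than recursing, keeps the sketch free of recursion-depth concerns for very negative inputs.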


Remark :  Most  special  functions  of  importance  for  applications  arise  as  solutions  to 
fairly  simple  ordinary  differential  equations.  The  gamma  function  is  a  significant  exception. 
Indeed,  it  can  be  proved,  [11],  that  the  gamma  function  does  not  satisfy  any  algebraic 
differential  equation! 


Regular  Points 

We are now ready to develop the method of series solutions to ordinary differential equations.
Before we proceed to develop the general computational machinery, a naive calculation
in an elementary example will be enlightening.


† The axes are at different scales; the tick marks are at integer values.

Example 11.3. Consider the initial value problem

$$\frac{d^2 u}{dx^2} + u = 0, \qquad u(0) = 1, \quad u'(0) = 0. \tag{11.67}$$

Let  us  investigate  whether  we  can  construct  an  analytic  solution  in  the  form  of  a  convergent 
power  series 

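$$u(x) = u_0 + u_1 x + u_2 x^2 + u_3 x^3 + \cdots = \sum_{n=0}^{\infty} u_n x^n \tag{11.68}$$

that is based at the initial point $x_0 = 0$. Term-by-term differentiation yields the following
series expansions† for its derivatives:

$$\begin{aligned}
\frac{du}{dx} &= u_1 + 2 u_2 x + 3 u_3 x^2 + 4 u_4 x^3 + \cdots = \sum_{n=0}^{\infty} (n + 1)\, u_{n+1}\, x^n, \\
\frac{d^2 u}{dx^2} &= 2 u_2 + 6 u_3 x + 12 u_4 x^2 + 20 u_5 x^3 + \cdots = \sum_{n=0}^{\infty} (n + 1)(n + 2)\, u_{n+2}\, x^n.
\end{aligned} \tag{11.69}$$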


The  next  step  is  to  substitute  the  series  (11.68-69)  into  the  differential  equation  and  collect 
common  powers  of  x: 


$$\frac{d^2 u}{dx^2} + u = (2 u_2 + u_0) + (6 u_3 + u_1)\, x + (12 u_4 + u_2)\, x^2 + (20 u_5 + u_3)\, x^3 + \cdots = 0.$$


At  this  point,  one  focuses  attention  on  the  individual  coefficients,  appealing  to  the  following 
basic  observation: 


Two  convergent  power  series  are  equal  if  and  only  if  all  their  coefficients  are  equal. 


In particular, a power series represents the zero function‡ if and only if all its coefficients
are 0. In this manner we obtain the following infinite sequence of algebraic recurrence
relations among the coefficients:

$$\begin{array}{ll}
1: & 2 u_2 + u_0 = 0, \\
x: & 6 u_3 + u_1 = 0, \\
x^2: & 12 u_4 + u_2 = 0, \\
x^3: & 20 u_5 + u_3 = 0, \\
x^4: & 30 u_6 + u_4 = 0, \\
\;\vdots & \qquad \vdots \\
x^n: & (n + 1)(n + 2)\, u_{n+2} + u_n = 0.
\end{array} \tag{11.70}$$

Now,  the  initial  conditions  serve  to  prescribe  the  first  two  coefficients: 

$$u(0) = u_0 = 1, \qquad u'(0) = u_1 = 0.$$


† When working with the series in summation form, it helps to re-index in order to display
the term of degree $n$.

‡ Here it is essential that we work with analytic functions, since this result is not true for
$C^\infty$ functions! For example, the function $e^{-1/x^2}$ has identically zero power series at $x_0 = 0$; see
Exercise 11.3.21.

We then solve the recurrence relations in order: The first determines $u_2 = -\tfrac12 u_0 = -\tfrac12$;
the second, $u_3 = -\tfrac16 u_1 = 0$; next, $u_4 = -\tfrac1{12} u_2 = \tfrac1{24}$; then $u_5 = -\tfrac1{20} u_3 = 0$; then
$u_6 = -\tfrac1{30} u_4 = -\tfrac1{720}$, and so on. In general, it is not hard to see that

$$u_{2k} = \frac{(-1)^k}{(2k)!}, \qquad u_{2k+1} = 0, \qquad k = 0, 1, 2, \ldots.$$

Hence, the required series solution is

$$u(x) = 1 - \tfrac12 x^2 + \tfrac1{24} x^4 - \tfrac1{720} x^6 + \cdots = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\, x^{2k},$$

which, by the ratio test, converges for all $x$. We have thus recovered the well-known Taylor
series for $\cos x$, which is indeed the solution to the initial value problem. Changing the
initial conditions to $u(0) = u_0 = 0$, $u'(0) = u_1 = 1$, will similarly produce the usual
Taylor expansion of $\sin x$. Note that the generation of the Taylor series does not rely on
any a priori knowledge of trigonometric functions or the direct solution method for linear
constant-coefficient ordinary differential equations.
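The calculation in Example 11.3 is mechanical enough to hand to a computer. The sketch below is an illustration, not part of the original text: the function names `series_coefficients` and `evaluate` are hypothetical. It iterates the recurrence relation from (11.70) in exact rational arithmetic and compares the resulting partial sums with the library cosine and sine.

```python
from fractions import Fraction
import math

def series_coefficients(u0, u1, count):
    """Return u_0, ..., u_{count-1} for u'' + u = 0 with u(0) = u0, u'(0) = u1.

    Uses the recurrence (n + 1)(n + 2) u_{n+2} + u_n = 0 from (11.70); count >= 2.
    """
    u = [Fraction(u0), Fraction(u1)]
    for n in range(count - 2):
        u.append(-u[n] / ((n + 1) * (n + 2)))
    return u

def evaluate(coeffs, x):
    """Evaluate the partial sum of the power series at x by Horner's rule."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + float(c)
    return total

if __name__ == "__main__":
    print(series_coefficients(1, 0, 8))      # coefficients of the cosine series
    print(evaluate(series_coefficients(1, 0, 30), 1.0), math.cos(1.0))
    print(evaluate(series_coefficients(0, 1, 30), 2.0), math.sin(2.0))
```

Exact fractions are used so that the computed coefficients can be compared directly with the values $-\tfrac12, \tfrac1{24}, -\tfrac1{720}$ found by hand.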


Building  on  this  experience,  let  us  describe  the  general  method.  We  shall  concentrate 
on  solving  a  second-order  homogeneous  linear  differential  equation 


$$p(x)\, \frac{d^2 u}{dx^2} + q(x)\, \frac{du}{dx} + r(x)\, u = 0. \tag{11.71}$$


The coefficients $p(x), q(x), r(x)$ are assumed to be analytic functions on some common
domain. This means that, at a point $x_0$ within the domain, they admit convergent power
series expansions

$$\begin{aligned}
p(x) &= p_0 + p_1 (x - x_0) + p_2 (x - x_0)^2 + \cdots, \\
q(x) &= q_0 + q_1 (x - x_0) + q_2 (x - x_0)^2 + \cdots, \\
r(x) &= r_0 + r_1 (x - x_0) + r_2 (x - x_0)^2 + \cdots.
\end{aligned} \tag{11.72}$$

We expect that solutions to the differential equation are also analytic. This expectation is
justified, provided that the equation is regular at the point $x_0$, in the following sense.


Definition 11.4. A point $x = x_0$ is a regular point of a second-order linear ordinary
differential equation (11.71) if the leading coefficient does not vanish there:

$$p_0 = p(x_0) \ne 0.$$

A point where $p(x_0) = 0$ is known as a singular point.

In  short,  at  a  regular  point,  the  second-order  derivative  term  does  not  disappear,  and 
so  the  equation  is  “genuinely”  of  second  order. 


Remark: The definition of a singular point assumes that the other two coefficients do
not also vanish there, so that either $q(x_0) \ne 0$ or $r(x_0) \ne 0$. If all three functions happen
to vanish at $x_0$, we can cancel any common factor $(x - x_0)^k$, and hence, without loss of
generality, assume that at least one of the coefficient functions is nonzero at $x_0$.


Proofs of the basic existence theorem for differential equations at regular points can
be found in [18, 54, 59].

Theorem 11.5. Let $x_0$ be a regular point for the second-order homogeneous linear
ordinary differential equation (11.71). Then there exists a unique, analytic solution $u(x)$
to the initial value problem

$$u(x_0) = a, \qquad u'(x_0) = b. \tag{11.73}$$

The radius of convergence of the power series for $u(x)$ is at least as large as the distance
from the regular point $x_0$ to the nearest singular point of the differential equation in the
complex plane.

Thus, every solution to an analytic differential equation at a regular point $x_0$ can be
expanded in a convergent power series

$$u(x) = u_0 + u_1 (x - x_0) + u_2 (x - x_0)^2 + \cdots = \sum_{n=0}^{\infty} u_n (x - x_0)^n. \tag{11.74}$$

Since the power series necessarily coincides with the Taylor series for $u(x)$, its coefficients†

$$u_n = \frac{u^{(n)}(x_0)}{n!}$$

are multiples of the derivatives of the function at the point $x_0$. In particular, the first two
coefficients,

$$u_0 = u(x_0) = a, \qquad u_1 = u'(x_0) = b, \tag{11.75}$$

are fixed by the initial conditions. The remaining coefficients will then be uniquely
prescribed thanks to the uniqueness of solutions to initial value problems.

Near a regular point, the second-order differential equation (11.71) admits two linearly
independent analytic solutions, which we denote by $\hat u(x)$ and $\tilde u(x)$. The general solution
can be written as a linear combination of the two basis solutions:

$$u(x) = a\, \hat u(x) + b\, \tilde u(x). \tag{11.76}$$

A convenient choice is to have the first satisfy the initial conditions

$$\hat u(x_0) = 1, \qquad \hat u'(x_0) = 0, \tag{11.77}$$

and the second satisfy

$$\tilde u(x_0) = 0, \qquad \tilde u'(x_0) = 1, \tag{11.78}$$

although other conventions may be used depending on the circumstances. Given (11.77-78),
the linear combination (11.76) automatically satisfies the initial conditions (11.73).

† Some authors prefer to include the $n!$'s in the original power series; this is purely a matter
of personal taste.

The basic computational strategy to construct the power series solution to the initial
value problem is a straightforward adaptation of the method used in Example 11.3. One
substitutes the known power series (11.72) for the coefficient functions and the unknown
power series (11.74) for the solution into the differential equation (11.71). Multiplying out
the formulas and collecting the common powers of $x - x_0$ will result in a (complicated)
power series whose individual coefficients must be equated to zero. The lowest-order terms
are multiples of $(x - x_0)^0 = 1$, i.e., the constant terms. They produce a linear relation

$$u_2 = R_2(u_0, u_1) = R_2(a, b)$$

that prescribes the coefficient $u_2$ in terms of the initial data (11.75). The coefficient of
$(x - x_0)$ leads to a relation

$$u_3 = R_3(u_0, u_1, u_2) = R_3(a, b, R_2(a, b))$$

that prescribes $u_3$ in terms of the initial data and the previously computed coefficient $u_2$.
And so on. At the $n$th stage of the procedure, the coefficient of $(x - x_0)^n$ produces the
linear recurrence relation

$$u_{n+2} = R_{n+2}(u_0, \ldots, u_{n+1}), \qquad n = 0, 1, 2, \ldots, \tag{11.79}$$

that will prescribe the $(n + 2)$nd order coefficient in terms of the previously computed
coefficients. In this fashion, we will have constructed a formal power series solution to the
differential equation at a regular point. The one remaining issue is whether the resulting
power series actually converges. The full analysis can be found in [54, 59], and will serve
to complete the proof of the general Existence Theorem 11.5.
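The general recurrence (11.79) can be made concrete when the coefficient functions are given by finitely many Taylor coefficients at the base point. The following is a minimal sketch under that assumption; the function name `power_series_solution` and its list-based interface are hypothetical, not taken from the text. It collects the coefficient of $(x - x_0)^n$ in $p\,u'' + q\,u' + r\,u$ and solves for $u_{n+2}$, which is possible exactly when $p_0 \ne 0$.

```python
from fractions import Fraction

def power_series_solution(p, q, r, a, b, count):
    """Coefficients u_0, ..., u_{count-1} of the series solution of (11.71) at x0.

    p, q, r are lists of Taylor coefficients of the coefficient functions at x0
    (missing entries are treated as zero); a, b are the initial data (11.73).
    """
    def coef(c, j):
        return Fraction(c[j]) if j < len(c) else Fraction(0)

    if coef(p, 0) == 0:
        raise ValueError("x0 is a singular point: p(x0) = 0")
    u = [Fraction(a), Fraction(b)]
    for n in range(count - 2):
        # Coefficient of (x - x0)^n, omitting the single term that contains u_{n+2}.
        known = Fraction(0)
        for j in range(n + 1):
            k = n - j
            if j >= 1:
                known += coef(p, j) * (k + 2) * (k + 1) * u[k + 2]
            known += coef(q, j) * (k + 1) * u[k + 1]
            known += coef(r, j) * u[k]
        u.append(-known / (coef(p, 0) * (n + 2) * (n + 1)))
    return u

if __name__ == "__main__":
    # u'' + u = 0 with u(0) = 1, u'(0) = 0 reproduces the coefficients of Example 11.3.
    print(power_series_solution([1], [0], [1], 1, 0, 7))
```

The division by $p_0\,(n + 2)(n + 1)$ is the computational counterpart of Definition 11.4: at a singular point the relation can no longer be solved for $u_{n+2}$, and the sketch raises an error instead.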

Rather  than  continue  on  in  general,  the  best  way  to  learn  the  method  is  to  work 
through  another,  less  trivial,  example. 


The  Airy  Equation 


We  will  illustrate  the  procedure  by  constructing  power  series  solutions  to  the  Airy  equation 


$$\frac{d^2 u}{dx^2} = x\, u. \tag{11.80}$$


This second-order linear ordinary differential equation, which arises in applications to optics,
rainbows, and dispersive waves, has solutions that cannot be expressed in terms of
elementary  functions. 

For the Airy equation (11.80), the leading coefficient is constant, and so every point
is a regular point. For simplicity, we will look only for power series based at the origin
$x_0 = 0$, and therefore of the form (11.68). Equating the two series

$$\begin{aligned}
u''(x) &= 2 u_2 + 6 u_3 x + 12 u_4 x^2 + 20 u_5 x^3 + \cdots = \sum_{n=0}^{\infty} (n + 1)(n + 2)\, u_{n+2}\, x^n, \\
x\, u(x) &= u_0 x + u_1 x^2 + u_2 x^3 + \cdots = \sum_{n=1}^{\infty} u_{n-1}\, x^n,
\end{aligned}$$

leads to the following recurrence relations relating the coefficients:

$$\begin{array}{ll}
1: & 2 u_2 = 0, \\
x: & 6 u_3 = u_0, \\
x^2: & 12 u_4 = u_1, \\
x^3: & 20 u_5 = u_2, \\
x^4: & 30 u_6 = u_3, \\
\;\vdots & \qquad \vdots \\
x^n: & (n + 1)(n + 2)\, u_{n+2} = u_{n-1}.
\end{array}$$

As before, we solve them in order: The first equation determines $u_2 = 0$. The second prescribes
$u_3 = \tfrac16 u_0$ in terms of $u_0$. Next, we find $u_4 = \tfrac1{12} u_1$ in terms of $u_1$, followed by $u_5 = \tfrac1{20} u_2 = 0$;
then $u_6 = \tfrac1{30} u_3 = \tfrac1{180} u_0$ is first given in terms of $u_3$, but we already know the latter in
terms of $u_0$. And so on.

Let us now construct two basis solutions. The first has the initial conditions

$$u_0 = \hat u(0) = 1, \qquad u_1 = \hat u'(0) = 0.$$

The recurrence relations imply that the only nonzero coefficients $u_n$ occur when $n = 3k$ is
a multiple of 3. Moreover,

$$u_{3k} = \frac{u_{3k-3}}{3k\,(3k - 1)}.$$

A straightforward induction proves that

$$u_{3k} = \frac{1}{3k\,(3k - 1)(3k - 3)(3k - 4) \cdots 6 \cdot 5 \cdot 3 \cdot 2}.$$

The resulting solution is

$$\hat u(x) = 1 + \tfrac16 x^3 + \tfrac1{180} x^6 + \cdots = 1 + \sum_{k=1}^{\infty} \frac{x^{3k}}{3k\,(3k - 1)(3k - 3)(3k - 4) \cdots 6 \cdot 5 \cdot 3 \cdot 2}. \tag{11.81}$$

Note that the denominator is similar to a factorial, except every third term is omitted.
A straightforward application of the ratio test confirms that the series converges for all
(complex) $x$, in conformity with the general Theorem 11.5, which guarantees an infinite
radius of convergence because the Airy equation has no singular points.

Similarly, starting with the initial conditions

$$u_0 = \tilde u(0) = 0, \qquad u_1 = \tilde u'(0) = 1,$$

we find that the only nonzero coefficients $u_n$ occur when $n = 3k + 1$. The recurrence
relation

$$u_{3k+1} = \frac{u_{3k-2}}{(3k + 1)(3k)} \qquad \text{yields} \qquad u_{3k+1} = \frac{1}{(3k + 1)(3k)(3k - 2)(3k - 3) \cdots 7 \cdot 6 \cdot 4 \cdot 3}.$$

The resulting solution is

$$\tilde u(x) = x + \tfrac1{12} x^4 + \tfrac1{504} x^7 + \cdots = x + \sum_{k=1}^{\infty} \frac{x^{3k+1}}{(3k + 1)(3k)(3k - 2)(3k - 3) \cdots 7 \cdot 6 \cdot 4 \cdot 3}. \tag{11.82}$$

Again, the denominator skips every third term in the product. Every solution to the Airy
equation can be written as a linear combination of the two basis power series solutions
(11.81) and (11.82):

$$u(x) = a\, \hat u(x) + b\, \tilde u(x), \qquad \text{where} \quad a = u(0), \quad b = u'(0).$$

Both power series (11.81, 82) converge quite rapidly, and so the first few terms will provide
a reasonable approximation to the solutions for moderate values of $x$.

We have, in fact, already encountered another solution to the Airy equation. According
to formula (8.97), the integral

$$\mathrm{Ai}(x) = \frac{1}{\pi} \int_0^\infty \cos\bigl(s x + \tfrac13 s^3\bigr)\, ds \tag{11.83}$$

defines the Airy function of the first kind. Let us prove that it satisfies the Airy differential
equation (11.80):

$$\frac{d^2}{dx^2}\, \mathrm{Ai}(x) = x\, \mathrm{Ai}(x).$$


Before  differentiating,  we  recall  the  integration  by  parts  argument  in  (8.96)  to  re-express 
the  Airy  integral  in  absolutely  convergent  form: 


$$\mathrm{Ai}(x) = \frac{2}{\pi} \int_0^\infty \frac{s\, \sin\bigl(s x + \tfrac13 s^3\bigr)}{(x + s^2)^2}\, ds.$$

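We are now permitted to differentiate under the integral sign, producing (after some algebra)

$$\frac{d^2}{dx^2}\, \mathrm{Ai}(x) - x\, \mathrm{Ai}(x) = \frac{2}{\pi} \int_0^\infty \frac{d}{ds} \left[ \frac{s\,(x + s^2) \cos\bigl(s x + \tfrac13 s^3\bigr) - \sin\bigl(s x + \tfrac13 s^3\bigr)}{(x + s^2)^3} \right] ds = 0.$$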


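Thus, the Airy function must be a certain linear combination of the two basic series solutions
(11.81, 82):

$$\mathrm{Ai}(x) = \mathrm{Ai}(0)\, \hat u(x) + \mathrm{Ai}'(0)\, \tilde u(x).$$

Its values at $x = 0$ are, in fact, given by

$$\begin{aligned}
\mathrm{Ai}(0) &= \frac{1}{\pi} \int_0^\infty \cos\bigl(\tfrac13 s^3\bigr)\, ds = \frac{\Gamma\bigl(\tfrac13\bigr)}{2 \pi\, 3^{1/6}} = \frac{1}{3^{2/3}\, \Gamma\bigl(\tfrac23\bigr)} \approx .355028, \\
\mathrm{Ai}'(0) &= -\, \frac{1}{\pi} \int_0^\infty s\, \sin\bigl(\tfrac13 s^3\bigr)\, ds = -\, \frac{3^{1/6}\, \Gamma\bigl(\tfrac23\bigr)}{2 \pi} = -\, \frac{1}{3^{1/3}\, \Gamma\bigl(\tfrac13\bigr)} \approx -.258819.
\end{aligned} \tag{11.84}$$

The second and third expressions involve the gamma function (11.62); a proof, based on
complex integration, can be found in [85; p. 54].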
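Combining the two series (11.81, 82) with the values (11.84) gives a practical way to evaluate the Airy function for moderate $x$. The sketch below is illustrative only: the names `airy_basis`, `airy_ai`, `AI0`, and `AI1` are hypothetical, and the fixed number of terms is an assumption that suits moderate arguments rather than large ones, where cancellation between the two series degrades accuracy.

```python
import math

def airy_basis(x, terms=60):
    """Partial sums of the two basis series (11.81) and (11.82) at x."""
    hat_term, tilde_term = 1.0, x
    hat_sum, tilde_sum = hat_term, tilde_term
    for k in range(1, terms):
        # Successive terms follow from the recurrences for u_{3k} and u_{3k+1}.
        hat_term *= x**3 / ((3 * k) * (3 * k - 1))
        tilde_term *= x**3 / ((3 * k + 1) * (3 * k))
        hat_sum += hat_term
        tilde_sum += tilde_term
    return hat_sum, tilde_sum

# Initial values (11.84), written in terms of the gamma function.
AI0 = 1.0 / (3 ** (2 / 3) * math.gamma(2 / 3))
AI1 = -1.0 / (3 ** (1 / 3) * math.gamma(1 / 3))

def airy_ai(x, terms=60):
    """Ai(x) as the linear combination Ai(0) * u_hat(x) + Ai'(0) * u_tilde(x)."""
    u_hat, u_tilde = airy_basis(x, terms)
    return AI0 * u_hat + AI1 * u_tilde

if __name__ == "__main__":
    print(AI0, AI1)
    for x in (-2.0, -1.0, 0.0, 0.5, 1.0, 2.0):
        print(x, airy_ai(x))
```

A quick consistency check is to difference the computed values numerically and confirm that the second derivative is close to $x\,\mathrm{Ai}(x)$, as the Airy equation (11.80) requires.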


Exercises 


11.3.1.  Find  (a)  r (§),  (b)  r(|),  (c)  r (-§),  (d)  r (-§). 

11.3.2. Prove that $\Gamma\bigl(n + \tfrac12\bigr) = \dfrac{\sqrt{\pi}\; (2n)!}{2^{2n}\, n!}$ for every positive integer $n$.

11.3.3. Let $x \in \mathbb{C}$ be complex. (a) Prove that the gamma function integral (11.62) converges,
provided $\operatorname{Re} x > 0$. (b) Is formula (11.63) valid when $x$ is complex?

♦ 11.3.4. Prove that $\Gamma(x) = \int_0^1 (-\log s)^{x-1}\, ds$, and hence, for $0 \le n \in \mathbb{Z}$, we have
$n! = \int_0^1 (-\log s)^n\, ds$. Remark: Euler first established the latter identity directly, and used
it to define the gamma function.


11.3.5. Evaluate $\int_0^\infty \sqrt{x}\; e^{-x^3}\, dx$.


♦ 11.3.6. Can you construct a function $f(x)$ that satisfies the factorial functional equation (11.61)
and has the values $f(x) = 1$ for $0 < x < 1$? If so, is $f(x) = \Gamma(x + 1)$?

11.3.7. Explain how to construct the power series for $\sin x$ by solving the differential equation
(11.67).

11.3.8. Construct two independent power series solutions to the Euler equation $x^2 u'' - 2u = 0$
based at the point $x_0 = 1$.

11.3.9. Construct two independent power series solutions to the equation $u'' + x^2 u = 0$ based at
the point $x_0 = 0$.

11.3.10. Consider the ordinary differential equation $u'' + 2x\, u' + 2u = 0$. (a) Find two linearly
independent power series solutions in powers of $x$. (b) What is the radius of convergence
of your power series? (c) By inspection of your series, find one solution to the equation
expressible in terms of elementary functions. (d) Find an explicit (non-series) formula for
the second independent power series solution.

11.3.11.  Answer  Exercise  11.3.10  for  the  equation  u'  +  ^xu—  \  u  =  0,  which  is  a  special  case 
of  equation  (8.63). 

11.3.12. Consider the ordinary differential equation $u'' + x\, u' + 2u = 0$. (a) Find two linearly
independent power series solutions based at $x_0 = 0$. (b) Write down the power series for
the solution to the initial value problem $u(0) = 1$, $u'(0) = -1$. (c) What is the radius of
convergence of your power series solution in part (a)? Can you justify this by direct inspection
of your power series?


♦ 11.3.13. The Hermite equation of order $n$ is

$$\frac{d^2 u}{dx^2} - 2x\, \frac{du}{dx} + 2n\, u = 0. \tag{11.85}$$

Assuming $n \in \mathbb{N}$ is a nonnegative integer: (a) Find two linearly independent power series
solutions based at $x_0 = 0$, and then show that one of your solutions is a polynomial of
degree $n$. (b) Prove that the Hermite polynomial $H_n(x)$ defined in (8.64) solves the
Hermite equation (11.85) and hence is a multiple of the polynomial solution you found in
part (a). What is the multiple? (c) Prove that the Hermite polynomials are orthogonal
with respect to the inner product $\langle u, v \rangle = \int_{-\infty}^{\infty} u(x)\, v(x)\, e^{-x^2}\, dx$.

11.3.14. Use the ratio test to directly determine the radius of convergence of the series solutions
(11.81, 82) to the Airy equation.

11.3.15. Write down the general solution to the following ordinary differential equations:

(a) $u'' + (x - c)\, u = 0$, where $c$ is a fixed constant;

(b) $u'' = \lambda\, x\, u$, where $\lambda \ne 0$ is a fixed nonzero constant.


♦ 11.3.16. The Airy function of the second kind is defined by

$$\mathrm{Bi}(x) = \frac{1}{\pi} \int_0^\infty \left[ \exp\bigl(s x - \tfrac13 s^3\bigr) + \sin\bigl(s x + \tfrac13 s^3\bigr) \right] ds. \tag{11.86}$$

(a) Prove that $\mathrm{Bi}(x)$ is well defined and a solution to the Airy equation. (b) Given that†

$$\mathrm{Bi}(0) = \frac{1}{3^{1/6}\, \Gamma\bigl(\tfrac23\bigr)}, \qquad \mathrm{Bi}'(0) = \frac{3^{1/6}}{\Gamma\bigl(\tfrac13\bigr)}, \tag{11.87}$$

explain why every solution to the Airy equation can be written as a linear combination of
$\mathrm{Ai}(x)$ and $\mathrm{Bi}(x)$. (c) Write the two series solutions (11.81, 82) in terms of $\mathrm{Ai}(x)$ and $\mathrm{Bi}(x)$.

11.3.17. Use the Fourier transform to construct an $L^2$ solution to the Airy equation. Can you
identify your solution?

11.3.18.  Apply  separation  of  variables  to  the  Tricomi  equation  (4.137),  and  write  down  all 
separable  solutions.  Hint :  See  Exercise  11.3.15(b)  and  Exercise  11.3.16. 


† See [85; p. 54] for a proof.

T 11.3.19. (a) Show that $u(x) = \sum_{n=1}^{\infty} (n - 1)!\; x^n$ is a power series solution to the first-order linear
ordinary differential equation $x^2 u' - u + x = 0$. (b) For which $x$ does the series converge?
(c) Find an analytic formula for the general solution to the equation. (d) Find a second-order
homogeneous linear ordinary differential equation that has this power series as a
(formal) solution. Remark: The lesson of this exercise is that not all power series solutions
to ordinary differential equations converge. Theorem 11.5 guarantees convergence at a regular
point, but in this example the power series is based at the singular point $x_0 = 0$.


11.3.20.  True  or  false:  The  only  function  f(x)  that  has  identically  zero  Taylor  series  is  the  zero 
function. 

11.3.21. Define $f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0. \end{cases}$ (a) Prove that $f$ is a $C^\infty$ function for all $x \in \mathbb{R}$.
(b) Prove that $f(x)$ is not analytic by showing that its Taylor series at $x_0 = 0$ does not
converge to $f(x)$ when $x \ne 0$.


Regular  Singular  Points 


As  we  have  just  seen,  constructing  power  series  solutions  at  regular  points  is  a  reasonably 
straightforward  computational  exercise:  one  writes  down  a  power  series  with  arbitrary 
coefficients,  substitutes  into  the  differential  equation  along  with  a  pair  of  initial  conditions, 
and  recursively  solves  for  the  coefficients.  Finding  a  general  formula  for  the  coefficients 
might  be  challenging,  but  producing  their  successive  numerical  values,  degree  by  degree,  is 
a  mechanical  exercise. 

However, at a singular point, the solutions typically cannot be written as an ordinary
power series, and one needs to be cleverer. Of course, you may object: why not just solve
the equation away from the singular point and be done with it? But there are multiple
reasons  not  to  do  this.  First,  one  may  be  unable  to  discover  a  general  formula  for  the 
power  series  coefficients  at  regular  points.  Second,  the  most  informative  and  interesting 
behavior  of  solutions  is  typically  found  at  the  singular  points,  and  so  series  solutions  based 
at  singular  points  are  particularly  enlightening.  And  finally,  one  of  the  boundary  conditions 
required  for  us  to  complete  our  construction  of  separable  solutions  to  partial  differential 
equations  often  occurs  at  a  singular  point. 

Singular  points  appear  in  two  guises.  The  easier  to  handle,  and,  fortunately,  the  ones 
that  arise  in  almost  all  applications,  are  known  as  “regular  singular  points”.  Irregular 
singular  points  are  nastier,  and  we  will  not  make  any  attempt  to  understand  them  in  this 
text;  the  curious  reader  is  referred  to  [54,  59]. 

Definition 11.6. A second-order linear homogeneous ordinary differential equation that can be written in the form
\[ (x - x_0)^2\, a(x)\, \frac{d^2u}{dx^2} + (x - x_0)\, b(x)\, \frac{du}{dx} + c(x)\, u = 0, \tag{11.88} \]
where $a(x)$, $b(x)$, and $c(x)$ are analytic at $x = x_0$ and, moreover, $a(x_0) \neq 0$, is said to have a regular singular point at $x_0$.

The simplest example of a second-order equation with a regular singular point at $x_0 = 0$ is the Euler equation
\[ a x^2 u'' + b x u' + c u = 0, \tag{11.89} \]
with $a$, $b$, $c$ all constant and $a \neq 0$. Note that all other points are regular points. Euler equations can be readily solved by substituting the power ansatz $u(x) = x^r$. We find
\[ a x^2 u'' + b x u' + c u = a r (r - 1) x^r + b r x^r + c x^r = 0, \]
provided the exponent $r$ satisfies the indicial equation
\[ a r (r - 1) + b r + c = 0. \]
If this quadratic equation has two distinct roots $r_1 \neq r_2$, we obtain two linearly independent (possibly complex) solutions $\hat u(x) = x^{r_1}$ and $\tilde u(x) = x^{r_2}$. The general solution $u(x) = c_1 x^{r_1} + c_2 x^{r_2}$ is a linear combination thereof. Note that unless $r_1$ or $r_2$ is a nonnegative integer, all nonzero solutions have a singularity at the singular point $x = 0$. A repeated root, $r_1 = r_2$, has only one power solution, $\hat u(x) = x^{r_1}$, and requires an additional logarithmic term, $\tilde u(x) = x^{r_1} \log x$, for the second independent solution. In this case, the general solution has the form $u(x) = c_1 x^{r_1} + c_2 x^{r_1} \log x$.
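The power ansatz can be checked numerically. The following is a minimal sketch, not part of the original text: the function names `euler_indicial_roots` and `euler_residual` are illustrative, and the sample equation $x^2 u'' - 2x u' + 2u = 0$ is chosen only because its indices, 1 and 2, are easy to verify by hand.

```python
import cmath

def euler_indicial_roots(a, b, c):
    """Roots of a r (r - 1) + b r + c = 0, i.e. a r^2 + (b - a) r + c = 0."""
    disc = cmath.sqrt((b - a) ** 2 - 4 * a * c)
    return (-(b - a) + disc) / (2 * a), (-(b - a) - disc) / (2 * a)

def euler_residual(a, b, c, r, x):
    """Evaluate a x^2 u'' + b x u' + c u for u = x^r, using exact derivatives."""
    u = x ** r
    du = r * x ** (r - 1)
    d2u = r * (r - 1) * x ** (r - 2)
    return a * x * x * d2u + b * x * du + c * u

# Illustrative equation: x^2 u'' - 2 x u' + 2 u = 0, indicial equation r^2 - 3 r + 2 = 0.
r1, r2 = euler_indicial_roots(1, -2, 2)
print(r1, r2)
print(abs(euler_residual(1, -2, 2, r1, 1.7)), abs(euler_residual(1, -2, 2, r2, 1.7)))
```

The roots are returned as complex numbers so that the same code also covers the case of complex indices, for which $x^r$ is evaluated as a complex power.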

The series solution method at more general regular singular points is modeled on the simple example of the Euler equation. One now seeks a solution that has a series expansion of the form
\[ u(x) = (x - x_0)^r \sum_{n=0}^{\infty} u_n (x - x_0)^n = u_0 (x - x_0)^r + u_1 (x - x_0)^{r+1} + u_2 (x - x_0)^{r+2} + \cdots. \tag{11.90} \]
The exponent $r$ is known as the index. If $r = 0$, or, more generally, if $r$ is a positive integer, then (11.90) is an ordinary power series, but we allow the possibility of a non-integral, or even complex, index $r$. We can assume, without any loss of generality, that the leading coefficient $u_0 \neq 0$. Indeed, if $u_k \neq 0$ is the first nonzero coefficient, then the series begins with the term $u_k (x - x_0)^{r+k}$, and we merely replace $r$ by $r + k$ to write it in the form (11.90). Since any scalar multiple of a solution is a solution, we can further assume that $u_0 = 1$, in which case we call (11.90) a normalized Frobenius series in honor of the German mathematician Georg Frobenius, who systematically established the calculus of series solutions at regular singular points in the late 1800s. The index $r$, and the higher-order coefficients $u_1, u_2, \ldots$, are then found by substituting the normalized Frobenius series into the differential equation (11.88) and equating the coefficients of the powers of $x - x_0$ to zero.

Warning: Unlike those in ordinary power series expansions, the coefficients $u_0 = 1$ and $u_1$ are not prescribed by the initial conditions at the point $x_0$.

Since
\[ \begin{aligned} u(x) &= (x - x_0)^r + u_1 (x - x_0)^{r+1} + \cdots, \\ (x - x_0)\, u'(x) &= r\, (x - x_0)^r + (r + 1)\, u_1 (x - x_0)^{r+1} + \cdots, \\ (x - x_0)^2\, u''(x) &= r (r - 1) (x - x_0)^r + (r + 1)\, r\, u_1 (x - x_0)^{r+1} + \cdots, \end{aligned} \]
the terms of lowest order in the equation are multiples of $(x - x_0)^r$. Equating their coefficients to zero produces a quadratic equation of the form
\[ a_0\, r (r - 1) + b_0\, r + c_0 = 0, \tag{11.91} \]
where
\[ a_0 = a(x_0) = \tfrac12\, p''(x_0), \qquad b_0 = b(x_0) = q'(x_0), \qquad c_0 = c(x_0), \]
with $p(x) = (x - x_0)^2 a(x)$ and $q(x) = (x - x_0)\, b(x)$ denoting the coefficients of $u''$ and $u'$ in (11.88), are the leading coefficients in the power series expansions of the individual coefficient functions. The quadratic equation (11.91) is known as the indicial equation, since it determines the possible indices $r$ in the Frobenius expansion (11.90) of a solution.

As with the Euler equation, the quadratic indicial equation usually has two roots, say $r_1$ and $r_2$, which provide two allowable indices, and one thus expects to find two independent Frobenius expansions. Usually, this expectation is realized, but there is an important exception. The general result is summarized in the following list:

(i) If $r_2 - r_1$ is not an integer, then there are two linearly independent solutions $\hat u(x)$ and $\tilde u(x)$, each having convergent normalized Frobenius expansions of the form (11.90).

(ii) If $r_1 = r_2$, then there is only one solution $\hat u(x)$ with a normalized Frobenius expansion (11.90). One can construct a second independent solution of the form
\[ \tilde u(x) = \log(x - x_0)\, \hat u(x) + v(x), \qquad \text{where} \qquad v(x) = \sum_{n=1}^{\infty} v_n (x - x_0)^{n + r_1} \tag{11.92} \]
is a convergent Frobenius series.

(iii) Finally, if $r_2 = r_1 + k$, where $k > 0$ is a positive integer, then there is a nonzero solution $\hat u(x)$ with a convergent Frobenius expansion corresponding to the larger index $r_2$. One can construct a second independent solution of the form
\[ \tilde u(x) = c \log(x - x_0)\, \hat u(x) + v(x), \qquad \text{where} \qquad v(x) = (x - x_0)^{r_1} + \sum_{n=1}^{\infty} v_n (x - x_0)^{n + r_1} \tag{11.93} \]
is a convergent Frobenius series, and $c$ is a constant, which may be 0, in which case the second solution $\tilde u(x)$ is also of Frobenius form.

Thus, in every case, the differential equation has at least one nonzero solution with a convergent Frobenius expansion. If the second independent solution does not have a Frobenius expansion, then it requires an additional logarithmic term of a well-prescribed form. Rather than try to develop the general theory in any more detail here, we will content ourselves with working through a couple of particular examples.
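Which of the three cases in the list above applies depends only on the leading coefficients $a(x_0)$, $b(x_0)$, $c(x_0)$ of (11.88). The sketch below illustrates that bookkeeping; the function names and the floating-point tolerance `tol` are assumptions made for this illustration, and the returned labels simply name the first, second, and third cases.

```python
import cmath

def indicial_roots(a0, b0, c0):
    """Roots r1, r2 of a0 r (r - 1) + b0 r + c0 = 0 (requires a0 != 0)."""
    disc = cmath.sqrt((b0 - a0) ** 2 - 4 * a0 * c0)
    return (-(b0 - a0) - disc) / (2 * a0), (-(b0 - a0) + disc) / (2 * a0)

def frobenius_case(a0, b0, c0, tol=1e-12):
    """Return 'i', 'ii', or 'iii' according to the difference of the two indices."""
    r1, r2 = indicial_roots(a0, b0, c0)
    diff = r2 - r1
    if abs(diff) < tol:
        return "ii"    # repeated index: a logarithmic term is required
    if abs(diff.imag) < tol and abs(diff.real - round(diff.real)) < tol:
        return "iii"   # indices differ by a nonzero integer: a logarithm may appear
    return "i"         # two independent Frobenius series

# Bessel's equation of order m has a0 = 1, b0 = 1, c0 = -m^2 at x0 = 0.
for m in (1 / 3, 0.0, 0.5, 1.0):
    print(m, frobenius_case(1, 1, -m * m))
```

Note that order $m = \tfrac12$ lands in the third case by this test, even though (as the half-integer Bessel functions show) the logarithmic coefficient $c$ can turn out to be zero.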


Example 11.7. Consider the second-order ordinary differential equation
\[ \frac{d^2u}{dx^2} + \left( \frac{1}{x} + \frac{x}{2} \right) \frac{du}{dx} + u = 0. \tag{11.94} \]
We look for series solutions based at $x = 0$. Note that, upon multiplying by $x^2$, the equation takes the form
\[ x^2 u'' + x \left( 1 + \tfrac12 x^2 \right) u' + x^2 u = 0, \]
and hence $x_0 = 0$ is a regular singular point, with $a(x) = 1$, $b(x) = 1 + \tfrac12 x^2$, $c(x) = x^2$.

We thus look for a solution that can be represented by a Frobenius expansion:
\[ \begin{aligned} u(x) &= x^r + u_1 x^{r+1} + \cdots + u_n x^{n+r} + \cdots, \\ x\, u'(x) &= r\, x^r + (r + 1)\, u_1 x^{r+1} + \cdots + (n + r)\, u_n x^{n+r} + \cdots, \\ \tfrac12 x^3 u'(x) &= \tfrac12 r\, x^{r+2} + \tfrac12 (r + 1)\, u_1 x^{r+3} + \cdots + \tfrac12 (n + r - 2)\, u_{n-2}\, x^{n+r} + \cdots, \\ x^2 u''(x) &= r (r - 1)\, x^r + (r + 1)\, r\, u_1 x^{r+1} + \cdots + (n + r)(n + r - 1)\, u_n x^{n+r} + \cdots. \end{aligned} \tag{11.95} \]
Substituting into the differential equation, we find that the coefficient of $x^r$ leads to the indicial equation
\[ r^2 = 0. \]
There is only one root, $r = 0$, and hence, even though we are at a singular point, the Frobenius expansion reduces to an ordinary power series. The coefficient of $x^{r+1} = x$ tells us that $u_1 = 0$. The general recurrence relation, for $n \geq 2$, is
\[ n^2 u_n + \tfrac12\, n\, u_{n-2} = 0, \]
and hence
\[ u_n = -\,\frac{u_{n-2}}{2n}. \]
Therefore, the odd coefficients $u_{2k+1} = 0$ are all zero, while the even ones are
\[ u_{2k} = -\,\frac{u_{2k-2}}{4k} = \frac{u_{2k-4}}{4k(4k - 4)} = -\,\frac{u_{2k-6}}{4k(4k - 4)(4k - 8)} = \cdots = \frac{(-1)^k}{4^k\, k!}, \]
since $u_0 = 1$. The resulting power series assumes a recognizable form:
\[ \hat u(x) = \sum_{k=0}^{\infty} u_{2k}\, x^{2k} = \sum_{k=0}^{\infty} \frac{1}{k!} \left( -\,\frac{x^2}{4} \right)^{k} = e^{-x^2/4}, \]
which is an explicit elementary solution to the ordinary differential equation (11.94).
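The recurrence in this example is easy to run exactly. The following minimal sketch (the function name `example_coefficients` is an assumption made for illustration) generates the coefficients with rational arithmetic, compares them with the closed form $(-1)^k / (4^k k!)$, and compares a partial sum with $e^{-x^2/4}$ at a sample point.

```python
from fractions import Fraction
from math import exp, factorial

def example_coefficients(n_max):
    """u_0, ..., u_{n_max} from n^2 u_n + (n/2) u_{n-2} = 0 with u_0 = 1, u_1 = 0."""
    u = [Fraction(1), Fraction(0)]
    for n in range(2, n_max + 1):
        u.append(-u[n - 2] / (2 * n))
    return u[: n_max + 1]

u = example_coefficients(20)
# Closed form from the text: u_{2k} = (-1)^k / (4^k k!).
closed_form_ok = all(
    u[2 * k] == Fraction((-1) ** k, 4 ** k * factorial(k)) for k in range(11)
)
x = 1.3  # arbitrary sample point
partial_sum = sum(float(c) * x ** n for n, c in enumerate(u))
print(closed_form_ok, partial_sum, exp(-x * x / 4))
```

Using `Fraction` avoids any rounding in the coefficient comparison, so agreement with the closed form is exact rather than approximate.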

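Since there is only one root to the indicial equation, the second solution $\tilde u(x)$ will require a logarithmic term. It can be constructed by a second application of the Frobenius method using the more complicated form (11.92). Alternatively, since the first solution is known, we can use a well-known reduction trick, [23]. Given one solution $\hat u(x)$ to a second-order linear ordinary differential equation, the general solution can be found by substituting the ansatz
\[ \tilde u(x) = v(x)\, \hat u(x) = v(x)\, e^{-x^2/4} \tag{11.96} \]
into the equation. In this case,
\[ \tilde u'' + \left( \frac{1}{x} + \frac{x}{2} \right) \tilde u' + \tilde u = v \left[ \hat u'' + \left( \frac{1}{x} + \frac{x}{2} \right) \hat u' + \hat u \right] + v' \left[ 2\, \hat u' + \left( \frac{1}{x} + \frac{x}{2} \right) \hat u \right] + v''\, \hat u = e^{-x^2/4} \left[ v'' + \left( \frac{1}{x} - \frac{x}{2} \right) v' \right], \]
where the first bracket vanishes because $\hat u$ is a solution, and the second is evaluated using $2\, \hat u' = -\,x\, \hat u$. If $\tilde u$ is to be a solution, $v'$ must satisfy a linear first-order ordinary differential equation:
\[ v'' + \left( \frac{1}{x} - \frac{x}{2} \right) v' = 0, \]
and hence
\[ v' = \frac{c\, e^{x^2/4}}{x}, \qquad v = c \int \frac{e^{x^2/4}}{x}\, dx + d = c \left( \log x + \sum_{k=1}^{\infty} \frac{x^{2k}}{2k\, 4^k\, k!} \right) + d, \]
where $c$, $d$ are arbitrary constants. We conclude that the general solution to the original differential equation is
\[ \tilde u(x) = v(x)\, \hat u(x) = \left[ c \left( \log x + \sum_{k=1}^{\infty} \frac{x^{2k}}{2k\, 4^k\, k!} \right) + d \right] e^{-x^2/4}, \tag{11.97} \]
which has the logarithmic form (11.92).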


Bessel’s  Equation 

Perhaps the most important “non-elementary” ordinary differential equation is
\[ x^2 u'' + x\, u' + (x^2 - m^2)\, u = 0, \tag{11.98} \]
known as Bessel's equation of order $m$. We assume here that the order $m \geq 0$ is a nonnegative real number. (Exercise 11.3.30 investigates Bessel equations of imaginary order.) The Bessel equation arises from separation of variables in a variety of partial differential equations, including the Laplace, heat, and wave equations on a disk, a cylinder, and a spherical ball.

The Bessel equation cannot (except in a few particular instances) be solved in terms of elementary functions, and so the use of power series is essential. The leading coefficient, $p(x) = x^2$, is nonzero except when $x = 0$, and so all points except the origin are regular. Therefore, at any $x_0 \neq 0$, the standard power series construction can be used to produce the solutions of the Bessel equation. However, the recurrence relations for the coefficients are not particularly easy to solve in closed form. Moreover, applications tend to demand understanding the behavior of solutions at the singular point $x_0 = 0$.

Comparison with (11.88) immediately shows that $x_0 = 0$ is a regular singular point, and so we seek solutions in Frobenius form. We substitute the first, second, and fourth expressions in (11.95) into the Bessel equation and then equate the coefficients of the various powers of $x$ to zero. The lowest power, $x^r$, provides the indicial equation
\[ r (r - 1) + r - m^2 = r^2 - m^2 = 0. \]
It has two solutions, $r = \pm m$, except when $m = 0$, for which $r = 0$ is the only index.

The higher powers of $x$ lead to recurrence relations for the coefficients $u_n$ in the Frobenius series. Replacing $m^2$ by $r^2$ produces
\[ \begin{aligned} x^{r+1}: &\quad \bigl[ (r + 1)^2 - r^2 \bigr]\, u_1 = (2r + 1)\, u_1 = 0, & u_1 &= 0, \\ x^{r+2}: &\quad \bigl[ (r + 2)^2 - r^2 \bigr]\, u_2 + 1 = (4r + 4)\, u_2 + 1 = 0, & u_2 &= -\,\frac{1}{4r + 4}, \\ x^{r+3}: &\quad \bigl[ (r + 3)^2 - r^2 \bigr]\, u_3 + u_1 = (6r + 9)\, u_3 + u_1 = 0, & u_3 &= -\,\frac{u_1}{6r + 9} = 0, \end{aligned} \]
and, in general,
\[ x^{r+n}: \quad \bigl[ (r + n)^2 - r^2 \bigr]\, u_n + u_{n-2} = n (2r + n)\, u_n + u_{n-2} = 0. \]
Thus, the general recurrence relation is
\[ u_n = -\,\frac{1}{n (2r + n)}\, u_{n-2}, \qquad n = 2, 3, 4, \ldots. \tag{11.99} \]
Starting with $u_0 = 1$, $u_1 = 0$, it is easy to deduce that $u_n = 0$ for all odd $n = 2k + 1$, while for even $n = 2k$,
\[ u_{2k} = -\,\frac{u_{2k-2}}{4k (k + r)} = \frac{u_{2k-4}}{16\, k (k - 1)(r + k)(r + k - 1)} = \cdots = \frac{(-1)^k}{2^{2k}\, k (k - 1) \cdots 3 \cdot 2\; (r + k)(r + k - 1) \cdots (r + 2)(r + 1)}. \]
We have thus found the series solution
\[ u(x) = \sum_{k=0}^{\infty} u_{2k}\, x^{r + 2k} = \sum_{k=0}^{\infty} \frac{(-1)^k\, x^{r + 2k}}{2^{2k}\, k!\, (r + k)(r + k - 1) \cdots (r + 2)(r + 1)}. \tag{11.100} \]
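As a check on the algebra, the recurrence (11.99) can be iterated exactly for a sample index and compared with the closed-form coefficients in (11.100). This is a minimal sketch; the function names and the sample index $r = \tfrac13$ are assumptions chosen for illustration.

```python
from fractions import Fraction

def frobenius_coefficients(r, n_max):
    """u_0..u_{n_max} from (11.99): u_n = -u_{n-2} / (n (2r + n)), u_0 = 1, u_1 = 0."""
    u = [Fraction(1), Fraction(0)]
    for n in range(2, n_max + 1):
        u.append(-u[n - 2] / (n * (2 * r + n)))
    return u[: n_max + 1]

def closed_form(r, k):
    """u_{2k} = (-1)^k / (2^{2k} k! (r + k)(r + k - 1)...(r + 1)), as in (11.100)."""
    denom = Fraction(4) ** k
    for j in range(1, k + 1):
        denom *= j * (r + j)
    return Fraction((-1) ** k) / denom

r = Fraction(1, 3)  # sample non-integral index
u = frobenius_coefficients(r, 12)
print(all(u[2 * k] == closed_form(r, k) for k in range(7)))
print(all(u[n] == 0 for n in range(1, 13, 2)))
```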


So far, we have not paid attention to the precise values of the indices $r = \pm m$. In order to continue the recurrence, we need to ensure that the denominators in (11.99) are never 0. Since $n > 0$, a vanishing denominator will appear whenever $2r + n = 0$, and so $r = -\tfrac12 n$ is either a negative integer $-1, -2, -3, \ldots$ or half-integer $-\tfrac12, -\tfrac32, -\tfrac52, \ldots$. This will occur when the order $m = -r = \tfrac12 n$ is either an integer or a half-integer. Indeed, these are precisely the situations in which the two indices, namely $r_1 = -m$ and $r_2 = m$, differ by an integer, $r_2 - r_1 = n$, and so we are in the tricky case (iii) of the Frobenius method.

There is, in fact, a major difference between the integral and the half-integral cases. Recall that the odd coefficients $u_{2k+1} = 0$ in the Frobenius series automatically vanish, and so we only have to worry about the recurrence relation (11.99) for even values of $n$. When $n = 2k$, the factor $2r + n = 2(r + k) = 0$ vanishes only when $r = -k$ is a negative integer; the half-integral values do not, in fact, cause problems. Therefore, if the order $m \geq 0$ is not an integer, then the Bessel equation of order $m$ admits two linearly independent Frobenius solutions, given by the expansions (11.100) with exponents $r = +m$ and $r = -m$. On the other hand, if $m$ is an integer, there is only one Frobenius solution, namely the expansion (11.100) for the positive index $r = +m$. The Frobenius recurrence with index $r = -m$ breaks down, and the second independent solution must include a logarithmic term; details appear below.
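The difference between the integral and half-integral cases can be seen by scanning the even-index denominators of (11.99). The sketch below is illustrative only; the function name `first_breakdown` and the cutoff `n_max` are assumptions.

```python
from fractions import Fraction

def first_breakdown(r, n_max=20):
    """Return the first even n <= n_max with n (2r + n) = 0, or None if the
    even-index recurrence (11.99) never divides by zero in that range."""
    for n in range(2, n_max + 1, 2):
        if n * (2 * r + n) == 0:
            return n
    return None

print(first_breakdown(-2))               # integer order m = 2, index r = -2
print(first_breakdown(Fraction(-1, 2)))  # half-integer order m = 1/2, index r = -1/2
print(first_breakdown(Fraction(-3, 2)))  # half-integer order m = 3/2, index r = -3/2
```

For a negative integer index the recurrence fails at $n = -2r$, whereas for half-integer indices the only vanishing factor would occur at an odd $n$, which the even-index recurrence never visits.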

By convention, the standard Bessel function of order $m$ is obtained by multiplying the Frobenius solution (11.100) with $r = m$ by
\[ \frac{1}{2^m\, m!}, \qquad \text{or, more generally,} \qquad \frac{1}{2^m\, \Gamma(m + 1)}, \tag{11.101} \]
where the first factorial form can be used if $m$ is a nonnegative integer, while the more general gamma function expression must be employed for non-integral values of $m$. The result is
\[ J_m(x) = \sum_{k=0}^{\infty} \frac{(-1)^k\, x^{m + 2k}}{2^{2k + m}\, k!\, (m + k)!} = \frac{1}{2^m\, m!} \left[ x^m - \frac{x^{m+2}}{4 (m + 1)} + \frac{x^{m+4}}{32 (m + 1)(m + 2)} - \frac{x^{m+6}}{384 (m + 1)(m + 2)(m + 3)} + \cdots \right]. \tag{11.102} \]
When $m$ is non-integral, the $(m + k)!$ should be replaced by $\Gamma(m + k + 1)$, and $m!$ by $\Gamma(m + 1)$. With this convention, the series is well defined for all real $m$ except when $m = -1, -2, -3, \ldots$ is a negative integer. Actually, if $m$ is a negative integer, the first $|m|$ terms in the series vanish, because, at the nonpositive integers, $\Gamma(-n) = \infty$, and so $1/\Gamma(-n) = 0$. With this convention, one can prove that
\[ J_{-m}(x) = (-1)^m J_m(x), \qquad m = 1, 2, 3, \ldots. \tag{11.103} \]

A simple application of the ratio test tells us that the power series converges for all (complex) values of $x$, and hence $J_m(x)$ is everywhere analytic. Indeed, the convergence is quite rapid when $x$ is of moderate size, and so summing the series is a reasonably effective method for computing the Bessel function $J_m(x)$, although in serious applications one adopts more sophisticated numerical techniques based on asymptotic expansions and integral formulas, [85, 86]. In particular, we note that
\[ J_0(0) = 1, \qquad J_m(0) = 0, \quad m > 0. \tag{11.104} \]
Figure 11.4 displays graphs of the first four Bessel functions for $0 \leq x \leq 20$; the vertical axes range from $-0.5$ to $1.0$. Most software packages, both symbolic and numeric, include routines for accurately evaluating and graphing Bessel functions, and their properties can be regarded as well known.

Figure 11.4. Bessel functions.
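For moderate $x$, summing the series directly is indeed practical. The following minimal sketch sums (11.102) with the factorials written as gamma functions; the function name `bessel_j`, the fixed number of terms, and the restrictions $x \geq 0$ and $m > -1$ are assumptions of this illustration, not a substitute for a library routine.

```python
from math import gamma

def bessel_j(m, x, terms=40):
    """Partial sum of (11.102), with (m + k)! written as Gamma(m + k + 1).
    Assumes x >= 0 and m > -1, so that every gamma value is finite."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * (x / 2) ** (2 * k + m) / (gamma(k + 1) * gamma(m + k + 1))
    return total

print(bessel_j(0, 0.0), bessel_j(2, 0.0))   # values at the origin, cf. (11.104)
print(bessel_j(0, 1.0), bessel_j(1, 1.0))
```

Because the terms alternate in sign, the partial sums lose accuracy to cancellation once $x$ is large, which is one reason serious implementations switch to asymptotic expansions there.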

Example 11.8. Consider the Bessel equation of order $m = \tfrac12$. There are two indices, $r = \pm\tfrac12$, and the Frobenius method yields two independent solutions: $J_{1/2}(x)$ and $J_{-1/2}(x)$. For the first, with $r = \tfrac12$, the recurrence relation (11.99) takes the form
\[ u_n = -\,\frac{u_{n-2}}{(n + 1)\, n}. \]
Starting with $u_0 = 1$ and $u_1 = 0$, the general formula is easily found to be
\[ u_n = \begin{cases} \dfrac{(-1)^k}{(n + 1)!}, & n = 2k \ \text{even}, \\[1ex] 0, & n = 2k + 1 \ \text{odd}. \end{cases} \]
Therefore, the resulting solution is
\[ \hat u(x) = \sqrt{x}\, \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k + 1)!}\, x^{2k} = \frac{1}{\sqrt{x}} \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k + 1)!}\, x^{2k + 1} = \frac{\sin x}{\sqrt{x}}. \]
According to (11.101), the Bessel function of order $\tfrac12$ is obtained by dividing this function by
\[ \sqrt{2}\; \Gamma\bigl( \tfrac32 \bigr) = \sqrt{\frac{\pi}{2}}, \]
where we used (11.66) to evaluate the gamma function at $\tfrac32$. Therefore,
\[ J_{1/2}(x) = \sqrt{\frac{2}{\pi x}}\; \sin x. \tag{11.105} \]
Similarly, for the other index $r = -\tfrac12$, the recurrence relation
\[ u_n = -\,\frac{u_{n-2}}{n (n - 1)} \]
leads to the formula
\[ u_n = \begin{cases} \dfrac{(-1)^k}{n!}, & n = 2k \ \text{even}, \\[1ex] 0, & n = 2k + 1 \ \text{odd}, \end{cases} \]
for its coefficients, corresponding to the solution
\[ \tilde u(x) = x^{-1/2} \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k)!}\, x^{2k} = \frac{\cos x}{\sqrt{x}}. \]
Therefore, in view of (11.101) and (11.65), the Bessel function of order $-\tfrac12$ is
\[ J_{-1/2}(x) = \frac{\sqrt{2}}{\Gamma\bigl( \tfrac12 \bigr)}\, \frac{\cos x}{\sqrt{x}} = \sqrt{\frac{2}{\pi x}}\; \cos x. \tag{11.106} \]
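The two closed forms can be compared against the defining series. The sketch below re-sums (11.102) with gamma functions at orders $\pm\tfrac12$; the helper name `bessel_j` and the sample points are assumptions of this illustration, and $x > 0$ is assumed so that the fractional powers are real.

```python
from math import gamma, sin, cos, sqrt, pi

def bessel_j(m, x, terms=40):
    """Partial sum of (11.102) with (m + k)! replaced by Gamma(m + k + 1); assumes x > 0."""
    return sum(
        (-1) ** k * (x / 2) ** (2 * k + m) / (gamma(k + 1) * gamma(m + k + 1))
        for k in range(terms)
    )

for x in (0.5, 1.0, 3.0):
    err_sin = bessel_j(0.5, x) - sqrt(2 / (pi * x)) * sin(x)    # check (11.105)
    err_cos = bessel_j(-0.5, x) - sqrt(2 / (pi * x)) * cos(x)   # check (11.106)
    print(x, err_sin, err_cos)
```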


As  we  noted  above,  if  m  is  not  an  integer,  the  two  independent  solutions  to  the  Bessel 
equation  of  order  m  are  Jm(x)  and  J_m(x).  However,  when  m  is  an  integer,  (11.103) 
implies  that  these  two  solutions  are  constant  multiples  of  each  other,  and  so  one  must  look 
elsewhere  for  a  second  independent  solution.  One  method  is  to  use  a  generalized  Frobenius 
expansion involving a logarithmic term, i.e., (11.92) when $m = 0$ (see Exercise 11.3.33)
or  (11.93)  when  m  >  0.  A  second  approach  is  to  employ  the  reduction  procedure  used  in 
Example  11.7.  Yet  another  option  relies  on  the  following  limiting  procedure;  see  [85,  119] 
for  full  details. 

Theorem 11.9. If $m > 0$ is not an integer, then the Bessel functions $J_m(x)$ and $J_{-m}(x)$ provide two linearly independent solutions to the Bessel equation of order $m$. On the other hand, if $m = 0, 1, 2, 3, \ldots$ is an integer, then a second independent solution, traditionally denoted by $Y_m(x)$ and called the Bessel function of the second kind of order $m$, can be found as a limiting case
\[ Y_m(x) = \lim_{\nu \to m} \frac{J_\nu(x) \cos \nu\pi - J_{-\nu}(x)}{\sin \nu\pi} \tag{11.107} \]
of a certain linear combination of Bessel functions of non-integral order $\nu$.


With some further analysis, it can be shown that the Bessel function of the second kind of order $m$ has the logarithmic Frobenius expansion
\[ Y_m(x) = \frac{2}{\pi} \left( \gamma + \log \frac{x}{2} \right) J_m(x) + \sum_{k=0}^{\infty} b_k\, x^{2k - m}, \qquad m = 0, 1, 2, \ldots, \tag{11.108} \]
with coefficients
\[ b_k = \begin{cases} -\,\dfrac{(m - k - 1)!}{\pi\, 2^{2k - m}\, k!}, & 0 \leq k \leq m - 1, \\[2ex] \dfrac{(-1)^{k - m - 1}\, (h_{k - m} + h_k)}{\pi\, 2^{2k - m}\, k!\, (k - m)!}, & k \geq m, \end{cases} \]
where
\[ h_0 = 0, \qquad h_k = 1 + \frac12 + \frac13 + \cdots + \frac1k, \quad k > 0, \]
while
\[ \gamma = \lim_{k \to \infty} (h_k - \log k) \approx 0.5772156649\ldots \tag{11.109} \]
is known as the Euler or Euler-Mascheroni constant. All Bessel functions of the second kind have a singularity at the origin $x = 0$; indeed, by inspection of (11.108), we find that the leading asymptotics as $x \to 0$ are
\[ Y_0(x) \sim \frac{2}{\pi} \log x, \qquad Y_m(x) \sim -\,\frac{2^m\, (m - 1)!}{\pi\, x^m}, \quad m > 0. \tag{11.110} \]
Figure 11.5 contains graphs of the first four Bessel functions of the second kind on the interval $0 < x \leq 20$; the vertical axis ranges from $-1$ to $1$.

Figure 11.5. Bessel functions of the second kind.
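The limit definition (11.107) and the logarithmic expansion (11.108) can be compared numerically. The sketch below is a rough check under stated assumptions: the function names are illustrative, the series for $J_\nu$ uses gamma functions, the limit is approximated by simply evaluating at the nearby non-integral order $\nu = m + \varepsilon$ with $\varepsilon = 10^{-6}$, and the coefficients $b_k$ for $k < m$ are taken with a negative sign, as the asymptotics (11.110) require.

```python
from math import gamma, log, pi, cos, sin, factorial

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, cf. (11.109)

def bessel_j(nu, x, terms=40):
    """Series (11.102) with gamma functions; assumes x > 0 and nu not a negative integer."""
    return sum(
        (-1) ** k * (x / 2) ** (2 * k + nu) / (gamma(k + 1) * gamma(nu + k + 1))
        for k in range(terms)
    )

def harmonic(k):
    """h_k = 1 + 1/2 + ... + 1/k, with h_0 = 0."""
    return sum(1.0 / j for j in range(1, k + 1))

def bessel_y_series(m, x, terms=40):
    """Logarithmic expansion (11.108) for integer m >= 0 and x > 0."""
    total = (2 / pi) * (EULER_GAMMA + log(x / 2)) * bessel_j(m, x, terms)
    for k in range(terms):
        if k < m:
            b = -factorial(m - k - 1) / (pi * 2.0 ** (2 * k - m) * factorial(k))
        else:
            b = ((-1) ** (k - m - 1) * (harmonic(k - m) + harmonic(k))
                 / (pi * 2.0 ** (2 * k - m) * factorial(k) * factorial(k - m)))
        total += b * x ** (2 * k - m)
    return total

def bessel_y_limit(m, x, eps=1e-6):
    """Crude approximation of the limit (11.107), evaluated at nu = m + eps."""
    nu = m + eps
    return (bessel_j(nu, x) * cos(nu * pi) - bessel_j(-nu, x)) / sin(nu * pi)

for m in (0, 1, 2):
    print(m, bessel_y_series(m, 2.5), bessel_y_limit(m, 2.5))
```

The two columns agree only to roughly the size of $\varepsilon$, since the quotient in (11.107) is evaluated at a nearby order rather than in the limit.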

Finally,  we  show  how  Bessel  functions  of  different  orders  are  interconnected  by  two 
important  recurrence  relations. 


Proposition 11.10. The Bessel functions are related by the following formulae:
\[ \frac{dJ_m}{dx} + \frac{m}{x}\, J_m(x) = J_{m-1}(x), \qquad -\,\frac{dJ_m}{dx} + \frac{m}{x}\, J_m(x) = J_{m+1}(x). \tag{11.111} \]

Proof: Differentiating the power series
\[ x^m J_m(x) = \sum_{k=0}^{\infty} \frac{(-1)^k\, x^{2m + 2k}}{2^{2k + m}\, k!\, (m + k)!} \]
produces
\[ \frac{d}{dx} \bigl[ x^m J_m(x) \bigr] = \sum_{k=0}^{\infty} \frac{(-1)^k\, 2 (m + k)\, x^{2m + 2k - 1}}{2^{2k + m}\, k!\, (m + k)!} = x^m \sum_{k=0}^{\infty} \frac{(-1)^k\, x^{m + 2k - 1}}{2^{2k + m - 1}\, k!\, (m - 1 + k)!} = x^m J_{m-1}(x). \tag{11.112} \]
Expansion of the left-hand side of this formula leads to
\[ x^m\, \frac{dJ_m}{dx} + m\, x^{m-1} J_m(x) = \frac{d}{dx} \bigl[ x^m J_m(x) \bigr] = x^m J_{m-1}(x), \]
which establishes the first recurrence formula (11.111). The second is proved by a similar manipulation involving differentiation of $x^{-m} J_m(x)$. Q.E.D.
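Both recurrence formulas in (11.111) can be tested numerically by replacing the derivative with a central difference. This is an illustrative sketch only: the helper names, the step size `h`, and the sample orders are assumptions, and the order $m - 1$ is kept away from the negative integers so that the gamma-function series applies.

```python
from math import gamma

def bessel_j(m, x, terms=40):
    """Series (11.102) with gamma functions; assumes x > 0 and m not a negative integer."""
    return sum(
        (-1) ** k * (x / 2) ** (2 * k + m) / (gamma(k + 1) * gamma(m + k + 1))
        for k in range(terms)
    )

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def recurrence_residuals(m, x):
    """Residuals of the two formulas (11.111); both should be close to zero."""
    dJ = derivative(lambda s: bessel_j(m, s), x)
    first = dJ + m / x * bessel_j(m, x) - bessel_j(m - 1, x)
    second = -dJ + m / x * bessel_j(m, x) - bessel_j(m + 1, x)
    return first, second

for m, x in ((1, 1.5), (2, 3.0), (0.5, 2.0), (1.5, 0.8)):
    print(m, x, recurrence_residuals(m, x))
```

Adding the two formulas eliminates the derivative and gives $J_{m-1}(x) + J_{m+1}(x) = (2m/x)\, J_m(x)$, which can be checked in the same way without any differencing.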

For example, using the second recurrence formula (11.111) along with (11.105), we can write the Bessel function of order $\tfrac32$ in elementary terms:
\[ J_{3/2}(x) = -\,\frac{dJ_{1/2}}{dx} + \frac{1}{2x}\, J_{1/2}(x) = \sqrt{\frac{2}{\pi}} \left( -\,\frac{\cos x}{x^{1/2}} + \frac{\sin x}{2\, x^{3/2}} \right) + \sqrt{\frac{2}{\pi}}\; \frac{\sin x}{2\, x^{3/2}} = \sqrt{\frac{2}{\pi}}\; \frac{\sin x - x \cos x}{x^{3/2}}. \tag{11.113} \]
Iterating, one concludes that Bessel functions of half-integral order, $m = \pm\tfrac12, \pm\tfrac32, \pm\tfrac52, \ldots$, are all elementary functions, in that they can be written in terms of trigonometric functions and powers of $\sqrt{x}$. We will make use of these functions in our treatment of the three-dimensional heat and wave equations in spherical geometry. On the other hand, all of the other Bessel functions are non-elementary special functions.

With  this,  we  conclude  our  brief  introduction  to  the  method  of  Frobenius  and  the 
basics  of  Bessel  functions.  The  reader  interested  in  delving  further  into  either  the  general 
method  or  the  host  of  additional  properties  of  Bessel  functions  is  encouraged  to  consult  a 
more  specialized  text,  e.g.,  [59,  85,  119]. 


Exercises 


11.3.22. Consider the ordinary differential equation $2x u'' + u' + x u = 0$. (a) Prove that $x = 0$ is a regular singular point. (b) Find two independent series solutions in powers of $x$.


C  11.3.23.  Consider  the  differential  equation 


u 


// 


u 


(a)  Classify  all  x0  e  M  as  either  a 


2  —  x  xz 

(i) regular point; (ii) regular singular point; and/or (iii) irregular singular point. Explain
your answers. (b) Find a series solution to the equation based at the point $x_0 = 0$, or
explain why none exists. What is the radius of convergence of your series?


11.3.24.  Consider  the  differential  equation  u 


// 


^1  —  — ^  u  u  —  0. 


(a) Classify all $x_0 \in \mathbb{R}$ as either (i) a regular point; (ii) a regular singular point;
(iii) an irregular singular point; (iv) none of the above. Explain your answers.

(b) Write out the first five nonzero terms in a series solution.

11.3.25. Consider the differential equation $4x u'' + 2u' + u = 0$. (a) Classify the values of $x$ for which the equation has regular points, regular singular points, and irregular singular points. (b) Find two independent series solutions, in powers of $x$. For what values of $x$ do your series converge? (c) By inspection of your series, write the general solution to the equation in terms of elementary functions.

C 11.3.26. The Chebyshev differential equation is $(1 - x^2)\, u'' - x u' + m^2 u = 0$. (a) Find all (i) regular points; (ii) regular singular points; (iii) irregular singular points. (b) Show that if $m$ is an integer, the equation has a polynomial solution of degree $m$, known as a Chebyshev polynomial. Write down the Chebyshev polynomials of degrees 1, 2, and 3. (c) For $m = 1$, find two linearly independent series solutions based at the point $x_0 = 1$.

11.3.27.  Write  the  following  Bessel  functions  in  terms  of  elementary  functions: 

(a) $J_{5/2}(x)$, (b) $J_{7/2}(x)$, (c) $J_{-3/2}(x)$.

11.3.28.  Prove  the  identity  (11.103). 

11.3.29. Suppose that $u(x)$ solves Bessel's equation. (a) Find a second-order ordinary differential equation satisfied by the function $w(x) = \sqrt{x}\; u(x)$. (b) Use this result to rederive the formulas for $J_{1/2}(x)$ and $J_{-1/2}(x)$.

0 11.3.30. Let $m \geq 0$ be real, and consider the modified Bessel equation of order $m$:
\[ x^2 u'' + x u' - (x^2 + m^2)\, u = 0. \tag{11.114} \]
(a) Explain why $x_0 = 0$ is a regular singular point.
(b) Use the method of Frobenius to construct a series solution based at $x_0 = 0$. Can you relate your solutions to the Bessel function $J_m(x)$?

0 11.3.31. (a) Let $a$, $b$, $c$ be constants with $b, c \neq 0$. Show that the function $u(x) = x^a J_0(b\, x^c)$ solves the ordinary differential equation
\[ x^2\, \frac{d^2u}{dx^2} + (1 - 2a)\, x\, \frac{du}{dx} + (b^2 c^2 x^{2c} + a^2)\, u = 0. \]
What is the general solution to this equation?
(b) Find the general solution to the ordinary differential equation
\[ x^2\, \frac{d^2u}{dx^2} + \alpha\, x\, \frac{du}{dx} + (\beta\, x^{2c} + \gamma)\, u = 0, \]
for constants $\alpha$, $\beta$, $\gamma$, $c$ with $\beta c \neq 0$.

C 11.3.32. Let $k > 0$ be a constant. The ordinary differential equation
\[ \frac{d^2u}{dt^2} + e^{-2t}\, u = 0 \]
describes the vibrations of a weakening spring whose stiffness $k(t) = e^{-2t}$ is exponentially decaying in time. (a) Show that this equation can be solved in terms of Bessel functions of order 0. Hint: Perform a change of variables. (b) Does the solution tend to 0 as $t \to \infty$?

C 11.3.33. We know that $u(x) = J_0(x)$ is a solution to the Bessel equation of order 0, namely
\[ x u'' + u' + x u = 0. \tag{11.115} \]
In accordance with the general Frobenius method, construct a second solution of the form
\[ \tilde u(x) = J_0(x) \log x + \sum_{n=1}^{\infty} v_n x^n. \]


11.3.34.  Is  it  possible  to  have  all  solutions  to  an  ordinary  differential  equation  bounded  at  a 
regular  singular  point?  If  not,  explain  why  not.  If  true,  give  an  example  where  this 
happens. 




11.4  The  Heat  Equation  in  a  Disk,  Continued 


Now that we have acquired some familiarity with the solutions to Bessel's ordinary differential equation, we are ready to analyze the separable solutions to the heat equation in a polar geometry. At the end of Section 11.2, we were left with the task of solving the Bessel equation (11.58) of integer order $m$. As we now know, there are two independent solutions, namely the Bessel function of the first kind $J_m$, (11.102), and the more complicated Bessel function of the second kind $Y_m$, (11.107), and hence the general solution has the form
\[ p(z) = c_1 J_m(z) + c_2 Y_m(z), \]
for constants $c_1, c_2$. Reverting to our original radial coordinate $r = z / \sqrt{\lambda}$, we conclude that every solution to the radial equation (11.56) has the form
\[ p(r) = c_1 J_m\bigl( \sqrt{\lambda}\; r \bigr) + c_2 Y_m\bigl( \sqrt{\lambda}\; r \bigr). \]
Now, the singular point $r = 0$ represents the center of the disk, and the solutions must remain bounded there. While this is true for $J_m(z)$, the second Bessel function $Y_m(z)$ has, according to (11.110), a singularity at $z = 0$ and so is unsuitable for the present purposes. (On the other hand, it plays a role in other situations, e.g., the heat equation on an annular ring.) Thus, every separable solution that is bounded at $r = 0$ comes from the rescaled Bessel function of the first kind of order $m$:
\[ p(r) = J_m\bigl( \sqrt{\lambda}\; r \bigr). \tag{11.116} \]
The Dirichlet boundary condition at the disk's rim $r = 1$ requires
\[ p(1) = J_m\bigl( \sqrt{\lambda} \bigr) = 0. \]
Therefore, in order that $\lambda$ be a bona fide eigenvalue, $\sqrt{\lambda}$ must be a root of the $m$th order Bessel function $J_m$.

Remark: We already know, thanks to the positive definiteness of the Dirichlet boundary value problem, that the Helmholtz eigenvalues must all be positive, $\lambda > 0$, and so there will be no difficulty in taking its square root.

The  graphs  of  Jm(z)  strongly  indicate,  and,  indeed,  it  can  be  rigorously  proved, 
85,  119],  that  as  £  increases  above  0,  each  Bessel  function  oscillates,  with  slowly  de¬ 
creasing  amplitude,  between  positive  and  negative  values.  In  fact,  asymptotically, 


(11.117) 


and  so  the  oscillations  become  essentially  the  same  as  a  (phase-shifted)  cosine  whose  am¬ 
plitude  decreases  like  z~x!2 .  As  a  consequence,  there  exists  an  infinite  sequence  of  Bessel 
roots ,  which  we  number  in  increasing  order: 


JmiCm,n)  =  °>  where 

0  <  Cm,  1  <  Cm, 2  <  Cm, 3  <  '  '  '  with  Cm.n  >  00  aS  n  >  °°- 


(11.118) 


It  is  worth  emphasizing  that  the  Bessel  functions  are  not  periodic,  and  so  their  roots 
are  not  evenly  spaced.  However,  as  a  consequence  of  (11.117),  the  large  Bessel  roots  are 
asymptotically  close  to  the  evenly  spaced  roots  of  the  shifted  cosine: 


$$\zeta_{m,n} \sim \bigl( n + \tfrac12 m - \tfrac14 \bigr)\, \pi \quad\text{as}\quad n \to \infty. \tag{11.119}$$

Owing to their physical importance in a wide range of problems, the Bessel roots have
been extensively tabulated. The accompanying table displays all Bessel roots that are $< 12$
in magnitude. The columns of the table are indexed by $m$, the order of the Bessel function,
and the rows by $n$, the root number.

Table of Bessel Roots $\zeta_{m,n}$
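The Bessel roots can also be recomputed numerically and compared with the asymptotic estimate (11.119). The following is a minimal sketch, not the method used to produce the book's table: it assumes integer order $m$, evaluates $J_m$ from Bessel's integral representation $J_m(x) = \frac{1}{\pi}\int_0^\pi \cos(m\tau - x \sin\tau)\, d\tau$ with the trapezoid rule, and locates roots by a sign-change scan followed by bisection. The helper names `bessel_j`, `bessel_roots` and `asymptotic_root`, the scan start `1.0`, and the step `0.1` are illustrative choices that are adequate for small orders such as $m = 0, 1, 2$.

```python
import math

def bessel_j(m, x, steps=200):
    """J_m(x) for integer m >= 0 via Bessel's integral and the trapezoid rule."""
    # Half-weighted endpoint values of cos(m*tau - x*sin(tau)) at tau = 0 and tau = pi.
    total = 0.5 * (1.0 + math.cos(m * math.pi))
    for k in range(1, steps):
        tau = k * math.pi / steps
        total += math.cos(m * tau - x * math.sin(tau))
    return total / steps  # (1/pi) * (pi/steps) * total

def bessel_roots(m, count, start=1.0, step=0.1):
    """First `count` roots of J_m larger than `start` (scan for sign changes, then bisect)."""
    roots = []
    a, fa = start, bessel_j(m, start)
    while len(roots) < count:
        b = a + step
        fb = bessel_j(m, b)
        if fa * fb < 0.0:
            lo, hi, flo = a, b, fa
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(m, mid)
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            roots.append(0.5 * (lo + hi))
        a, fa = b, fb
    return roots

def asymptotic_root(m, n):
    """Right-hand side of (11.119)."""
    return (n + 0.5 * m - 0.25) * math.pi

if __name__ == "__main__":
    for m in (0, 1, 2):
        for n, zeta in enumerate(bessel_roots(m, 4), start=1):
            print(m, n, round(zeta, 4), round(asymptotic_root(m, n), 4))
```

Printing the two columns side by side shows the gap between the computed root and the asymptotic value shrinking as $n$ grows, as (11.119) predicts.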


Remark: According to (11.102),

$$J_m(0) = 0 \quad\text{for}\quad m > 0, \qquad\text{while}\qquad J_0(0) = 1.$$

However,  we  do  not  count  0  as  a  bona  fide  Bessel  root,  since  it  does  not  lead  to  a  valid 
eigenfunction  for  the  Helmholtz  boundary  value  problem. 


Summarizing our progress so far, the eigenvalues

$$\lambda_{m,n} = \zeta_{m,n}^2, \qquad n = 1, 2, 3, \ldots, \quad m = 0, 1, 2, \ldots, \tag{11.120}$$

of the Bessel boundary value problem (11.56-57) are the squares of the roots of the Bessel
function of order $m$. The corresponding eigenfunctions are

$$w_{m,n}(r) = J_m(\zeta_{m,n}\, r), \qquad n = 1, 2, 3, \ldots, \quad m = 0, 1, 2, \ldots, \tag{11.121}$$

defined for $0 \le r \le 1$. Combining (11.121) with the formula (11.55) for the angular
components, we conclude that the separable solutions (11.53) to the polar Helmholtz boundary
value problem (11.51) are

$$\begin{aligned}
v_{0,n}(r) &= J_0(\zeta_{0,n}\, r), \\
v_{m,n}(r,\theta) &= J_m(\zeta_{m,n}\, r)\, \cos m\theta, \\
\widehat{v}_{m,n}(r,\theta) &= J_m(\zeta_{m,n}\, r)\, \sin m\theta,
\end{aligned}
\qquad\text{where}\quad m, n = 1, 2, 3, \ldots. \tag{11.122}$$

These solutions define the normal modes for the unit disk; Figure 11.6 plots the first few of
them. The eigenvalues $\lambda_{0,n}$ are simple, and contribute radially symmetric eigenfunctions,
whereas the eigenvalues $\lambda_{m,n}$ for $m > 0$ are double, and produce two linearly independent
separable eigenfunctions, with trigonometric dependence on the angular variable.

Recalling  the  original  ansatz  (11.50),  we  have  at  last  produced  the  basic  separable 
eigensolutions 

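$$\begin{aligned}
u_{0,n}(t,r) &= e^{-\zeta_{0,n}^2 t}\, v_{0,n}(r) = e^{-\zeta_{0,n}^2 t}\, J_0(\zeta_{0,n}\, r), \\
u_{m,n}(t,r,\theta) &= e^{-\zeta_{m,n}^2 t}\, v_{m,n}(r,\theta) = e^{-\zeta_{m,n}^2 t}\, J_m(\zeta_{m,n}\, r)\, \cos m\theta, \\
\widehat{u}_{m,n}(t,r,\theta) &= e^{-\zeta_{m,n}^2 t}\, \widehat{v}_{m,n}(r,\theta) = e^{-\zeta_{m,n}^2 t}\, J_m(\zeta_{m,n}\, r)\, \sin m\theta,
\end{aligned}
\qquad m, n = 1, 2, 3, \ldots, \tag{11.123}$$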

Figure  11.6.  Normal  modes  for  a  disk. 


to  the  homogeneous  Dirichlet  boundary  value  problem  for  the  heat  equation  on  the  unit 
disk.  The  general  solution  is  obtained  by  linear  superposition,  in  the  form  of  an  infinite 
series 


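$$u(t,r,\theta) = \frac12 \sum_{n=1}^{\infty} a_{0,n}\, u_{0,n}(t,r) + \sum_{m,n=1}^{\infty} \bigl[\, a_{m,n}\, u_{m,n}(t,r,\theta) + b_{m,n}\, \widehat{u}_{m,n}(t,r,\theta) \,\bigr], \tag{11.124}$$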

where the initial factor of $\tfrac12$ is included, as with ordinary Fourier series, for later
convenience. As usual, the coefficients $a_{m,n}$, $b_{m,n}$ are determined by the initial condition

$$u(0,r,\theta) = \frac12 \sum_{n=1}^{\infty} a_{0,n}\, v_{0,n}(r) + \sum_{m,n=1}^{\infty} \bigl[\, a_{m,n}\, v_{m,n}(r,\theta) + b_{m,n}\, \widehat{v}_{m,n}(r,\theta) \,\bigr] = f(r,\theta). \tag{11.125}$$


This  requires  that  we  expand  the  initial  data  into  a  Fourier-Bessel  series  in  the  eigen¬ 
functions.  As  before,  it  is  possible  to  prove,  [34],  that  the  separable  eigenfunctions  are 
complete  —  there  are  no  other  eigenfunctions  —  and  hence  every  (reasonable)  function 
defined  on  the  unit  disk  can  be  written  as  a  convergent  series  in  the  Bessel  eigenfunctions. 


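Theorem 9.33 guarantees that the eigenfunctions are orthogonal† with respect to the
standard $L^2$ inner product

$$\langle u, v \rangle = \iint_D u(x,y)\, v(x,y)\, dx\, dy = \int_0^1 \int_{-\pi}^{\pi} u(r,\theta)\, v(r,\theta)\, r\, d\theta\, dr$$

on the unit disk. (Note the extra factor of $r$ coming from the polar coordinate form of
the area element $dx\, dy = r\, dr\, d\theta$.) The $L^2$ norms of the Fourier-Bessel eigenfunctions are
given by the interesting formulae

$$\| v_{0,n} \| = \sqrt{\pi}\; \bigl| J_1(\zeta_{0,n}) \bigr|, \qquad \| v_{m,n} \| = \| \widehat{v}_{m,n} \| = \sqrt{\frac{\pi}{2}}\; \bigl| J_{m+1}(\zeta_{m,n}) \bigr|, \tag{11.126}$$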


which  involve  the  value  of  the  Bessel  function  of  the  next-higher  order  at  the  appropriate 
Bessel  root.  A  proof  of  (11.126)  can  be  found  in  Exercise  11.4.22,  while  numerical  values 
are  provided  in  the  accompanying  table. 


Norms of the Fourier-Bessel Eigenfunctions $\| v_{m,n} \| = \| \widehat{v}_{m,n} \|$

 n \ m      0       1       2       3       4       5       6       7
   1      .9202   .5048   .4257   .3738   .3363   .3076   .2847   .2658
   2      .6031   .3761   .3401   .3126   .2906   .2725   .2572   .2441
   3      .4811   .3130   .2913   .2736   .2586   .2458   .2347   .2249
   4      .4120   .2737   .2589   .2462   .2352   .2255   .2169   .2092
   5      .3661   .2462   .2353   .2257   .2171   .2095   .2025   .1962
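The norm formula (11.126) and individual table entries can be checked by direct quadrature. The sketch below is one way to do so under stated assumptions: it evaluates $J_m$ for integer $m$ from Bessel's integral representation, brackets the first roots of $J_0$ and $J_1$ in the hand-picked intervals $[2, 3]$ and $[3, 4.5]$, and integrates $J_m(\zeta r)^2\, r$ over $0 \le r \le 1$ with Simpson's rule. All function names are illustrative.

```python
import math

def bessel_j(m, x, steps=200):
    """J_m(x) for integer m >= 0 via Bessel's integral and the trapezoid rule."""
    total = 0.5 * (1.0 + math.cos(m * math.pi))
    for k in range(1, steps):
        tau = k * math.pi / steps
        total += math.cos(m * tau - x * math.sin(tau))
    return total / steps

def bisect_root(m, lo, hi, iters=60):
    """Root of J_m in [lo, hi], assuming J_m changes sign exactly once there."""
    flo = bessel_j(m, lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = bessel_j(m, mid)
        if flo * fmid <= 0.0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

def simpson(g, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3.0

def norm_by_quadrature(m, zeta):
    """L2 norm of J_m(zeta r) cos(m theta) on the unit disk, computed numerically."""
    radial = simpson(lambda r: bessel_j(m, zeta * r) ** 2 * r, 0.0, 1.0)
    angular = 2.0 * math.pi if m == 0 else math.pi  # integral of cos^2(m theta)
    return math.sqrt(angular * radial)

def norm_by_formula(m, zeta):
    """Right-hand side of (11.126)."""
    factor = math.pi if m == 0 else 0.5 * math.pi
    return math.sqrt(factor) * abs(bessel_j(m + 1, zeta))

if __name__ == "__main__":
    for m, bracket in ((0, (2.0, 3.0)), (1, (3.0, 4.5))):
        zeta = bisect_root(m, *bracket)
        print(m, round(norm_by_quadrature(m, zeta), 4), round(norm_by_formula(m, zeta), 4))
```

The two printed values in each row should agree with each other and with the $n = 1$ entries in the $m = 0$ and $m = 1$ columns of the table.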

Orthogonality of the eigenfunctions implies that the coefficients in the Fourier-Bessel
series (11.125) are given by the inner product formulae

$$\begin{aligned}
a_{0,n} &= 2\, \frac{\langle f, v_{0,n} \rangle}{\| v_{0,n} \|^2} = \frac{2}{\pi\, J_1(\zeta_{0,n})^2} \int_0^1 \int_{-\pi}^{\pi} f(r,\theta)\, J_0(\zeta_{0,n}\, r)\, r\, d\theta\, dr, \\
a_{m,n} &= \frac{\langle f, v_{m,n} \rangle}{\| v_{m,n} \|^2} = \frac{2}{\pi\, J_{m+1}(\zeta_{m,n})^2} \int_0^1 \int_{-\pi}^{\pi} f(r,\theta)\, J_m(\zeta_{m,n}\, r)\, r \cos m\theta\, d\theta\, dr, \\
b_{m,n} &= \frac{\langle f, \widehat{v}_{m,n} \rangle}{\| \widehat{v}_{m,n} \|^2} = \frac{2}{\pi\, J_{m+1}(\zeta_{m,n})^2} \int_0^1 \int_{-\pi}^{\pi} f(r,\theta)\, J_m(\zeta_{m,n}\, r)\, r \sin m\theta\, d\theta\, dr.
\end{aligned} \tag{11.127}$$

† For the two independent eigenfunctions corresponding to one of the double eigenvalues,
orthogonality must be verified by hand, but, in this case, it follows easily from the orthogonality
of their trigonometric components.
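To see the coefficient formulae (11.127) in action, the following sketch treats radially symmetric initial data, for which only the $a_{0,n}$ survive and the $\theta$ integral contributes a factor $2\pi$, so that $a_{0,n} = \frac{4}{J_1(\zeta_{0,n})^2} \int_0^1 f(r)\, J_0(\zeta_{0,n} r)\, r\, dr$. The sample data $f(r) = 1 - r^2$, the truncation at six terms, and all helper names are assumptions made purely for illustration; $J_m$ is evaluated from Bessel's integral representation and the radial integral by Simpson's rule.

```python
import math

def bessel_j(m, x, steps=200):
    """J_m(x) for integer m >= 0 via Bessel's integral and the trapezoid rule."""
    total = 0.5 * (1.0 + math.cos(m * math.pi))
    for k in range(1, steps):
        tau = k * math.pi / steps
        total += math.cos(m * tau - x * math.sin(tau))
    return total / steps

def j0_roots(count, start=1.0, step=0.1):
    """First `count` positive roots of J_0 (sign-change scan, then bisection)."""
    roots, a, fa = [], start, bessel_j(0, start)
    while len(roots) < count:
        b = a + step
        fb = bessel_j(0, b)
        if fa * fb < 0.0:
            lo, hi, flo = a, b, fa
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(0, mid)
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            roots.append(0.5 * (lo + hi))
        a, fa = b, fb
    return roots

def simpson(g, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3.0

def radial_coefficients(f, roots):
    """Coefficients a_{0,n} of (11.127) for data f(r) independent of theta."""
    coeffs = []
    for zeta in roots:
        integral = simpson(lambda r: f(r) * bessel_j(0, zeta * r) * r, 0.0, 1.0)
        coeffs.append(4.0 * integral / bessel_j(1, zeta) ** 2)
    return coeffs

def temperature(t, r, roots, coeffs):
    """Truncated series (11.124), keeping only the radially symmetric terms."""
    return 0.5 * sum(a * math.exp(-zeta * zeta * t) * bessel_j(0, zeta * r)
                     for zeta, a in zip(roots, coeffs))

if __name__ == "__main__":
    roots = j0_roots(6)
    coeffs = radial_coefficients(lambda r: 1.0 - r * r, roots)
    print(temperature(0.0, 0.5, roots, coeffs))   # close to f(0.5) = 0.75
    print(temperature(0.5, 0.5, roots, coeffs))   # dominated by the n = 1 term
```

At $t = 0$ the truncated series reproduces the initial data only approximately, since just six terms are kept; for $t > 0$ the higher terms are damped by the factors $e^{-\zeta_{0,n}^2 t}$ and the first term quickly dominates.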


In accordance with the general theory, each individual separable solution (11.123) to
the heat equation decays exponentially fast, at a rate $\lambda_{m,n} = \zeta_{m,n}^2$ prescribed by the square
of the corresponding Bessel root. In particular, the dominant mode, meaning the one that
persists the longest, is

$$u_{0,1}(t,r,\theta) = e^{-\zeta_{0,1}^2 t}\, J_0(\zeta_{0,1}\, r). \tag{11.128}$$

Its decay rate is prescribed by the smallest positive eigenvalue:

$$\zeta_{0,1}^2 \approx 5.783, \tag{11.129}$$

which is the square of the smallest root of the Bessel function $J_0(z)$. Since $J_0(z) > 0$ for
$0 \le z < \zeta_{0,1}$, the dominant eigenfunction $v_{0,1}(r,\theta) = J_0(\zeta_{0,1}\, r) > 0$ is radially symmetric
and strictly positive within the entire disk. Consequently, for most initial conditions
(specifically those for which $a_{0,1} \ne 0$), the disk's temperature distribution eventually
becomes entirely of one sign and radially symmetric, while decaying exponentially fast to zero
at the rate given by (11.129). See Figure 11.7 for a plot of a typical solution. Note how,
in accordance with the theory, the solution soon acquires a radial symmetry as it decays
to thermal equilibrium.


Exercises 

11.4.1. At the initial time $t_0 = 0$, a concentrated unit heat source is instantaneously applied at
position x = y = 0, to a circular metal disk of unit radius and unit thermal diffusivity
whose outside edge is held at 0°. Write down an eigenfunction series for the resulting
temperature distribution at time $t > 0$. Hint: Be careful working with the delta function in
polar coordinates; see Exercise 6.3.6.

11.4.2. Solve Exercise 11.4.1 when the concentrated unit heat source is instantaneously applied
at the center of the disk.

T 11.4.3. (a) Write down the Fourier-Bessel series for the solution to the heat equation on a unit
disk with $\gamma = 1$, whose circular edge is held at 0° and subject to the initial conditions
$u(0,x,y) = 1$ for $x^2 + y^2 < 1$. Hint: Use (11.112) to evaluate the integrals for the
coefficients. (b) Approximate the time $t_\star > 0$ after which the temperature of the disk is
everywhere < .5°.

X 11.4.4. (a) Write down the first three nonzero terms in the Fourier-Bessel series for the solution
to the heat equation on a unit disk with $\gamma = 1$ whose circular edge is held at 0° subject to
the initial conditions $u(0,r,\theta) = 1 - r$ for $r < 1$. Use numerical integration to evaluate the
coefficients. (b) Use your approximation to determine at which times $t > 0$ the temperature
of the disk is everywhere < .5°.

11.4.5.  Prove  that  every  separable  eigenfunction  of  the  Dirichlet  boundary  value  problem  for 
the  Helmholtz  equation  in  the  unit  disk  can  be  written  in  the  form 

$c\, J_m(\zeta_{m,n}\, r)\, \cos(m\theta - \alpha)$ for fixed $c \ne 0$ and $-\pi < \alpha \le \pi$.

11.4.6. Suppose the initial data $f(r,\theta)$ in (11.49) satisfies $\int_0^1 \int_{-\pi}^{\pi} f(r,\theta)\, J_0(\zeta_{0,1}\, r)\, r\, d\theta\, dr = 0$.

(a) What is the decay rate to equilibrium of the resulting heat equation solution $u(t,r,\theta)$?

(b) Prove that, generically, the asymptotic temperature distribution has half the disk above
the equilibrium temperature and the other half below. Can you predict the diameter that
separates the two halves? (c) If you know that $a_{0,1} = 0$, and also that the long-time
temperature distribution is radially symmetric, what is the (generic) decay rate? What is
the asymptotic temperature distribution?

0  11.4.7.  Show  how  to  use  a  scaling  symmetry  to  solve  the  heat  equation  in  a  disk  of  radius  R 
knowing  the  solution  in  a  disk  of  radius  1. 

11.4.8. Use rescaling, as in Exercise 11.4.7, to produce the solution to the Dirichlet
initial-boundary value problem for a disk of radius 2 with diffusion coefficient $\gamma = 5$.

11.4.9.  If  it  takes  a  disk  of  unit  radius  3  minutes  to  reach  (approximate)  thermal  equilibrium, 
how  long  will  it  take  a  disk  of  radius  2  made  out  of  the  same  material  and  subject  to  the 
same  homogeneous  boundary  conditions  to  reach  equilibrium? 

11.4.10. Assuming Dirichlet boundary conditions, does a square or a circular disk of the same
area reach thermal equilibrium faster? Use your intuition first, and then check using the
explicit formulas.

11.4.11. Answer Exercise 11.4.10 when the square and circle have the same perimeter.

11.4.12.  Which  reaches  thermal  equilibrium  faster:  a  disk  whose  edge  is  held  at  0°  or  a  disk  of 
the  same  radius  that  is  fully  insulated? 

11.4.13.  A  circular  metal  disk  is  removed  from  an  oven  and  then  fully  insulated. 

True  or  false:  (a)  The  eventual  equilibrium  temperature  is  constant. 

(b) For large $t \gg 0$, the temperature $u(t,x,y)$ becomes more and more radially symmetric.
If  false,  what  can  you  say  about  the  temperature  profile  at  large  times? 

C 11.4.14. (a) Write down an eigenfunction series formula for the temperature dynamics of a disk
of radius 1 that has an insulated boundary. (b) What is the eventual equilibrium
temperature? (c) Is the rate of decay to thermal equilibrium (i) faster, (ii) slower, or (iii) the
same as a disk with Dirichlet boundary conditions?

C 11.4.15. Write out a series solution for the temperature in a half-disk of radius 1, subject to
(a) homogeneous Dirichlet boundary conditions on its entire boundary; (b) homogeneous
Dirichlet conditions on the circular part of its boundary and homogeneous Neumann
conditions on the straight part. (c) Which of the two boundary conditions results in a faster
return to equilibrium temperature? How much faster?

11.4.16.  A  large  sheet  of  metal  is  heated  to  100°.  A  circular  disk  and  a  semi-circular  half-disk 
of  the  same  radius  are  cut  out  of  it.  Their  edges  are  then  held  at  0°,  while  being  fully  insu¬ 
lated  from  above  and  below. 

(a)  True  or  false:  The  half-disk  goes  to  thermal  equilibrium  twice  as  fast  as  the  disk. 

(b)  If  you  need  to  wait  20  minutes  for  the  circular  disk  to  cool  down  enough  to  be  picked  up 
in  your  bare  hands,  how  long  do  you  need  to  wait  to  pick  up  the  semi-circular  disk? 

£  11.4.17.  Two  identical  plates  have  the  shape  of  an  annular  ring  {1  <  r  <  2}  with  inner  radius 
1  and  outer  radius  2.  The  first  has  an  insulated  inner  boundary  and  outer  boundary  held 
at  0°,  while  the  second  has  an  insulated  outer  boundary  and  inner  boundary  held  at  0°.  If 
both  start  out  at  the  same  temperature,  which  reaches  thermal  equilibrium  faster? 

Quantify  the  rates  of  decay. 


T 11.4.18. Let $m \ge 0$ be a nonnegative integer. In this exercise, we investigate the completeness
of the eigenfunctions of the Bessel boundary value problem (11.56-57). To this end, define
the Sturm-Liouville linear differential operator

$$S[u] = -\,\frac{1}{x}\, \frac{d}{dx}\! \left( x\, \frac{du}{dx} \right) + \frac{m^2}{x^2}\, u,$$

subject to the boundary conditions $|u'(0)| < \infty$, $u(1) = 0$, and either $|u(0)| < \infty$ when
$m = 0$, or $u(0) = 0$ when $m > 0$.

(a) Show that $S$ is self-adjoint relative to the inner product $\langle f, g \rangle = \int_0^1 f(x)\, g(x)\, x\, dx$.

(b) Prove that the eigenfunctions of $S$ are the rescaled Bessel functions $J_m(\zeta_{m,n}\, x)$ for
$n = 1, 2, 3, \ldots$. What are the orthogonality relations?

(c) Find the Green's function $G(x;\xi)$ and modified Green's function $\widehat{G}(x;\xi)$, cf. (9.59),
associated with the boundary value problem $S[u] = 0$.

(d) Use the criterion of Theorem 9.47 to prove that the eigenfunctions are complete.

11.4.19. Determine the Bessel roots $\zeta_{1/2,n}$. Do they satisfy the asymptotic formula (11.119)?

£ 11.4.20. Use a numerical root finder to compute the first 10 Bessel roots $\zeta_{3/2,n}$, $n = 1, \ldots, 10$.
Compare your values with the asymptotic formula (11.119).

11.4.21. Prove that $J_{m-1}(\zeta_{m,n}) = -\,J_{m+1}(\zeta_{m,n})$.

^ 11.4.22. In this exercise, we prove formula (11.126).

(a) First, use the recurrence formulae (11.111) to prove

$$\frac{d}{dx} \Bigl[ x^2 \bigl( J_m(x)^2 - J_{m-1}(x)\, J_{m+1}(x) \bigr) \Bigr] = 2\, x\, J_m(x)^2.$$

(b) Integrate both sides of the previous formula from 0 to the Bessel zero $\zeta_{m,n}$ and then
use Exercise 11.4.21 to show that

$$\int_0^{\zeta_{m,n}} x\, J_m(x)^2\, dx = -\,\frac{\zeta_{m,n}^2}{2}\, J_{m-1}(\zeta_{m,n})\, J_{m+1}(\zeta_{m,n}) = \frac{\zeta_{m,n}^2}{2}\, J_{m+1}(\zeta_{m,n})^2.$$

(c) Next, use a change of variables to establish the identity

$$\int_0^1 r\, J_m(\zeta_{m,n}\, r)^2\, dr = \tfrac12\, J_{m+1}(\zeta_{m,n})^2.$$

(d) Finally, use the formulae for $v_{m,n}$ and $\widehat{v}_{m,n}$ to complete the proof of (11.126).

(} 11.4.23. Prove directly that the eigenfunctions $v_{m,n}(r,\theta)$ and $\widehat{v}_{m,n}(r,\theta)$ in (11.122) are
orthogonal with respect to the $L^2$ inner product on the unit disk.

11.4.24. Establish the following alternative formulae for the eigenfunction norms:

11.5 The Fundamental Solution to the Planar Heat Equation

As  we  learned  in  Section  8.1,  the  fundamental  solution  to  the  heat  equation  measures 
the  temperature  distribution  resulting  from  a  concentrated  initial  heat  source,  e.g.,  a  hot 
soldering  iron  applied  instantaneously  at  a  single  point  on  a  metal  plate.  The  physical 
problem  is  modeled  mathematically  by  taking  a  delta  function  as  the  initial  data  along 
with  the  relevant  homogeneous  boundary  conditions.  Once  the  fundamental  solution  is 
known,  one  is  able  to  use  linear  superposition  to  recover  the  solution  generated  by  any 
other  initial  data. 

As in our one-dimensional analysis, we shall concentrate on the most tractable case,
in which the domain is the entire plane: $\Omega = \mathbb{R}^2$. Thus, our first goal is to solve the initial
value problem

$$u_t = \gamma\, \Delta u, \qquad u(0,x,y) = \delta(x - \xi)\, \delta(y - \eta), \tag{11.130}$$

for $t > 0$ and $(x,y) \in \mathbb{R}^2$. The solution $u = F(t, \mathbf{x}; \boldsymbol{\xi}) = F(t,x,y;\xi,\eta)$ to this initial value
problem is known as the fundamental solution for the heat equation on $\mathbb{R}^2$.

The  quickest  route  to  the  desired  formula  relies  on  the  following  means  of  combining 
solutions  of  the  one-dimensional  heat  equation  to  produce  solutions  of  the  two-dimensional 
version. 

Lemma 11.11. Let $v(t,x)$ and $w(t,x)$ be any two solutions to the one-dimensional
heat equation $u_t = \gamma\, u_{xx}$. Then their product

$$u(t,x,y) = v(t,x)\, w(t,y) \tag{11.131}$$

is a solution to the two-dimensional heat equation $u_t = \gamma\, (u_{xx} + u_{yy})$.

Proof: Our assumptions imply that $v_t = \gamma\, v_{xx}$, while $w_t = \gamma\, w_{yy}$ when we write
$w(t,y)$ as a function of $t$ and $y$. Therefore, differentiating (11.131), we find

$$\frac{\partial u}{\partial t} = \frac{\partial v}{\partial t}\, w + v\, \frac{\partial w}{\partial t} = \gamma\, \frac{\partial^2 v}{\partial x^2}\, w + \gamma\, v\, \frac{\partial^2 w}{\partial y^2} = \gamma \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right),$$

and hence $u(t,x,y)$ solves the two-dimensional heat equation. Q.E.D.

For example, if


$$v(t,x) = e^{-\gamma \alpha^2 t} \sin \alpha x, \qquad w(t,y) = e^{-\gamma \beta^2 t} \sin \beta y,$$

are separable solutions of the one-dimensional heat equation, then

$$u(t,x,y) = e^{-\gamma (\alpha^2 + \beta^2)\, t} \sin \alpha x\, \sin \beta y$$

are the separable solutions we used to solve the heat equation on a rectangle. A more
interesting case is to choose

$$v(t,x) = \frac{1}{2 \sqrt{\pi \gamma t}}\; e^{-(x - \xi)^2/(4 \gamma t)}, \qquad w(t,y) = \frac{1}{2 \sqrt{\pi \gamma t}}\; e^{-(y - \eta)^2/(4 \gamma t)}, \tag{11.132}$$

to be the fundamental solutions (8.14) to the one-dimensional heat equation at respective
locations $x = \xi$ and $y = \eta$. Multiplying these two solutions together produces the
fundamental solution for the two-dimensional problem.
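Lemma 11.11 lends itself to a quick numerical sanity check. The sketch below multiplies a separable one-dimensional solution in $x$ by a one-dimensional fundamental solution in $y$ and measures the residual $u_t - \gamma\,(u_{xx} + u_{yy})$ with central finite differences. The diffusivity `GAMMA = 0.7`, the parameters `alpha` and `xi`, the sample point, and the step `h` are arbitrary illustrative values, and a small residual is evidence for, not a proof of, the lemma.

```python
import math

GAMMA = 0.7  # hypothetical diffusivity, chosen only for this check

def v(t, x, alpha=2.0):
    """Separable solution of u_t = GAMMA u_xx."""
    return math.exp(-GAMMA * alpha ** 2 * t) * math.sin(alpha * x)

def w(t, y, xi=0.3):
    """One-dimensional fundamental solution centered at xi, valid for t > 0."""
    return math.exp(-(y - xi) ** 2 / (4.0 * GAMMA * t)) / (2.0 * math.sqrt(math.pi * GAMMA * t))

def u(t, x, y):
    """Product solution (11.131)."""
    return v(t, x) * w(t, y)

def residual(t, x, y, h=1e-3):
    """Central-difference estimate of u_t - GAMMA (u_xx + u_yy) at one point."""
    u_t = (u(t + h, x, y) - u(t - h, x, y)) / (2.0 * h)
    u_xx = (u(t, x + h, y) - 2.0 * u(t, x, y) + u(t, x - h, y)) / h ** 2
    u_yy = (u(t, x, y + h) - 2.0 * u(t, x, y) + u(t, x, y - h)) / h ** 2
    return u_t - GAMMA * (u_xx + u_yy)

if __name__ == "__main__":
    print(u(0.5, 0.4, 0.1), residual(0.5, 0.4, 0.1))
```

The residual should be several orders of magnitude smaller than $u$ itself, limited only by the finite-difference truncation error of order $h^2$.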


Theorem 11.12. The fundamental solution to the heat equation $u_t = \gamma\, \Delta u$ corresponding
to a unit delta function placed at position $(\xi,\eta) \in \mathbb{R}^2$ at the initial time $t_0 = 0$
is

$$F(t,x,y;\xi,\eta) = \frac{1}{4 \pi \gamma t}\; e^{-[\,(x - \xi)^2 + (y - \eta)^2\,]/(4 \gamma t)}. \tag{11.133}$$

Proof: Since we already know that both functions (11.132) are solutions to the
one-dimensional heat equation, Lemma 11.11 guarantees that their product, which equals
(11.133), solves the two-dimensional heat equation for $t > 0$. Moreover, at the initial
time,

$$u(0,x,y) = v(0,x)\, w(0,y) = \delta(x - \xi)\, \delta(y - \eta)$$

is a product of delta functions, and hence the result follows. Indeed, the total heat

$$\iint u(t,x,y)\, dx\, dy = \left( \int_{-\infty}^{\infty} v(t,x)\, dx \right) \left( \int_{-\infty}^{\infty} w(t,y)\, dy \right) = 1, \qquad t \ge 0,$$

remains constant, while

$$\lim_{t \to 0^+} u(t,x,y) = \begin{cases} \infty, & (x,y) = (\xi,\eta), \\ 0, & \text{otherwise}, \end{cases}$$

has the standard delta function limit at the initial time instant. Q.E.D.
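The conservation of total heat used in the proof can be confirmed numerically. The following sketch evaluates (11.133) and integrates it over a large square with the midpoint rule; the truncation half-width `12.0` and grid size `240` are illustrative assumptions that work for moderate times, since the Gaussian is then negligible outside the square and well resolved by the grid.

```python
import math

def fundamental_solution(t, x, y, xi=0.0, eta=0.0, gamma=1.0):
    """Planar heat kernel (11.133), valid for t > 0."""
    dist2 = (x - xi) ** 2 + (y - eta) ** 2
    return math.exp(-dist2 / (4.0 * gamma * t)) / (4.0 * math.pi * gamma * t)

def total_heat(t, xi=0.0, eta=0.0, gamma=1.0, half_width=12.0, n=240):
    """Midpoint-rule integral of the kernel over [-half_width, half_width]^2."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            total += fundamental_solution(t, x, y, xi, eta, gamma)
    return total * h * h

if __name__ == "__main__":
    for t in (0.1, 0.5, 2.0):
        print(t, total_heat(t), fundamental_solution(t, 0.0, 0.0))
```

The second column should stay at 1 to within quadrature error, while the third column, the peak temperature at the source, falls off like $1/t$.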

Figure 11.8 depicts the evolution of the fundamental solution when $\gamma = 1$ at the
indicated times. Observe that the initially concentrated temperature spreads out in a
radially symmetric manner, while the total amount of heat remains constant. At any
individual point $(x,y) \ne (0,0)$, the initially zero temperature rises slightly at first, but
then decays monotonically back to zero at a rate proportional to $1/t$. As in the
one-dimensional case, since the fundamental solution is $> 0$ for all $t > 0$, the heat energy has
an infinite speed of propagation.

Both the one- and two-dimensional fundamental solutions have bell-shaped profiles
known as Gaussian filters. The most important difference is the initial factor. In a
one-dimensional medium, the fundamental solution decays in proportion to $1/\sqrt{t}$, whereas in
the plane the decay is more rapid, being proportional to $1/t$. The physical explanation is
that the heat energy is able to spread out in two independent directions, and hence diffuses
away from its initial source more rapidly. As we shall see, the decay in three-dimensional
space is more rapid still, being proportional to $t^{-3/2}$ for similar reasons; see (12.120).

Figure 11.8. The fundamental solution to the planar heat equation.

The  principal  use  of  the  fundamental  solution  is  for  solving  the  general  initial  value 
problem.  We  express  the  initial  temperature  distribution  as  a  superposition  of  delta  func¬ 
tion  impulses, 

$$u(0,x,y) = f(x,y) = \iint f(\xi,\eta)\, \delta(x - \xi,\, y - \eta)\, d\xi\, d\eta,$$

where, at the point $(\xi,\eta) \in \mathbb{R}^2$, the impulse has magnitude $f(\xi,\eta)$. Linearity implies that
the solution is then given by the same superposition of fundamental solutions.

Theorem 11.13. The solution to the initial value problem

$$u_t = \gamma\, \Delta u, \qquad u(0,x,y) = f(x,y), \qquad (x,y) \in \mathbb{R}^2,$$

for the planar heat equation is given by the linear superposition formula

$$u(t,x,y) = \frac{1}{4 \pi \gamma t} \iint f(\xi,\eta)\; e^{-[\,(x - \xi)^2 + (y - \eta)^2\,]/(4 \gamma t)}\, d\xi\, d\eta. \tag{11.134}$$

Figure 11.9. Diffusion of a disk.


We  can  interpret  the  solution  formula  (11.134)  as  a  two-dimensional  convolution 


$$u(t,x,y) = F(t,x,y) * f(x,y) \tag{11.135}$$

of the initial data with a one-parameter family of progressively wider and shorter Gaussian
filters

$$F(t,x,y) = F(t,x,y;0,0) = \frac{1}{4 \pi \gamma t}\; e^{-(x^2 + y^2)/(4 \gamma t)}. \tag{11.136}$$

As  in  (7.54),  such  a  convolution  can  be  interpreted  as  a  Gaussian  weighted  averaging  of 
the  function  f(x,y),  which  has  the  effect  of  smoothing  out  the  initial  data. 

Example  11.14.  If  our  initial  temperature  distribution  is  constant  on  a  circular 
region,  say 


$$u(0,x,y) = \begin{cases} 1, & x^2 + y^2 < 1, \\ 0, & \text{otherwise}, \end{cases}$$

then the solution can be evaluated using (11.134), as follows:

$$u(t,x,y) = \frac{1}{4 \pi t} \iint_D e^{-[\,(x - \xi)^2 + (y - \eta)^2\,]/(4 t)}\, d\xi\, d\eta,$$

where the integral is over the unit disk $D = \{ \xi^2 + \eta^2 \le 1 \}$. Unfortunately, the integral
cannot be expressed in terms of elementary functions. On the other hand, numerical
evaluation of the integral is straightforward. A plot of the resulting radially symmetric
solution  appears  in  Figure  11.9.  One  could  also  interpret  this  solution  as  the  diffusion  of 
an  animal  population  in  a  uniform  isotropic  environment  or  bacteria  in  a  similarly  uniform 
large  petri  dish  that  are  initially  confined  to  a  small  circular  region. 
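One simple way to carry out the numerical evaluation mentioned in Example 11.14 is a midpoint rule in polar coordinates over the unit disk. The sketch below assumes unit diffusivity, as in the displayed integral, and uses an illustrative $200 \times 200$ grid; the function name `diffused_disk` is hypothetical. At the origin the integral can be done by hand, giving $u(t,0,0) = 1 - e^{-1/(4t)}$, which provides a convenient accuracy check.

```python
import math

def diffused_disk(t, x, y, n_r=200, n_theta=200):
    """Midpoint-rule approximation of the integral in Example 11.14 (unit diffusivity).

    The source point is (rho cos(theta), rho sin(theta)) with area element
    rho d(rho) d(theta), integrated over the unit disk.
    """
    d_rho = 1.0 / n_r
    d_theta = 2.0 * math.pi / n_theta
    total = 0.0
    for i in range(n_r):
        rho = (i + 0.5) * d_rho
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta
            xi = rho * math.cos(theta)
            eta = rho * math.sin(theta)
            total += math.exp(-((x - xi) ** 2 + (y - eta) ** 2) / (4.0 * t)) * rho
    return total * d_rho * d_theta / (4.0 * math.pi * t)

if __name__ == "__main__":
    t = 0.1
    print(diffused_disk(t, 0.0, 0.0), 1.0 - math.exp(-1.0 / (4.0 * t)))
    for radius in (0.5, 1.0, 1.5):
        print(radius, diffused_disk(t, radius, 0.0))
```

Evaluating along a single ray suffices for plotting, because the solution depends only on the distance from the origin.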


Exercises 


11.5.1.  Solve  the  following  initial  value  problem:  ut  =  5 (uxx  +  u  ),  u(0,x,y)  =  e  +y  ' 

11.5.2. Write down an integral formula for the solution to the following initial value problem:

$u_t = 3\,(u_{xx} + u_{yy})$, $u(0,x,y) = (1 + x^2 + y^2)^{-2}$.

11.5.3. At the initial time $t = 0$, a unit heat source is instantaneously applied at the origin
of the $(x,y)$-plane. For $t > 0$, what is the maximum temperature experienced at a point
$(x,y) \ne 0$? At what time is the maximum temperature achieved? Does the temperature
approach an equilibrium value as $t \to \infty$? If so, how fast?


11.5.4. (a) Find an eigenfunction series representation of the fundamental solution for the heat
equation $u_t = \Delta u$ on the unit square $\{ 0 < x, y < 1 \}$ when subject to homogeneous
Dirichlet boundary conditions. (b) Write the solution to the initial value problem $u(0,x,y) = f(x,y)$
in terms of the fundamental solution. (c) Discuss how your formula is related to the
Fourier series solution (11.43).


11.5.5. Let $u(t,x,y)$ be a solution to the heat equation on all of $\mathbb{R}^2$ such that $u$ and $\| \nabla u \| \to 0$
rapidly as $\| \mathbf{x} \| \to \infty$. (a) Prove that the total heat $H(t) = \iint u(t,x,y)\, dx\, dy$ is constant.

(b) Explain how this can be reconciled with the statement that $u(t,x,y) \to 0$ as $t \to \infty$ at
all points $(x,y) \in \mathbb{R}^2$.


0 11.5.6. Consider the initial value problem $u_t = \gamma\, \Delta u + H(t,x,y)$, $u(0,x,y) = 0$, for the
inhomogeneous heat equation on the entire $(x,y)$-plane, where $H(t,x,y)$ represents a time-varying
external heat source. Derive an integral formula for its solution. Hint: Mimic the solution
method in Section 8.1.


11.5.7.  A  flat  plate  of  infinite  extent  with  unit  thermal  diffusivity  starts  off  at  0°.  From  then 
on,  a  unit  heat  source  is  continually  applied  at  the  origin.  Find  the  resulting  temperature 
distribution.  Does  the  temperature  eventually  reach  a  steady  state? 

Hint :  Use  Exercise  11.5.6. 


C 11.5.8. Building on Example 11.14, we model the "diffusion" of a set $D \subset \mathbb{R}^2$ as the solution
$u(t,x,y)$ to the heat equation $u_t = \Delta u$ subject to the initial condition $u(0,x,y) = \chi_D(x,y)$,
where

$$\chi_D(x,y) = \begin{cases} 1, & (x,y) \in D, \\ 0, & (x,y) \notin D, \end{cases}$$

is the characteristic function of the set $D$.

(a) Write down a formula for the diffusion of the set $D$.

(b) True or false: At each $t$, the diffusion $u(t,x,y)$ is the characteristic function of a set $D_t$.

(c) Prove that $0 < u(t,x,y) < 1$ for all $(x,y)$ and $t > 0$. (d) What is $\lim_{t \to \infty} u(t,x,y)$?

(e) Write down a formula for the diffusion of a unit square $D = \{ 0 \le x, y \le 1 \}$, and then
plot the result at several times. Discuss what you observe.


11.5.9. (a) Explain why the delta function on $\mathbb{R}^2$ satisfies the scaling law $\delta(x,y) = \beta^2\, \delta(\beta x, \beta y)$,
for $\beta \ne 0$. (b) Verify that the fundamental solution to the heat equation on $\mathbb{R}^2$ obeys the
same scaling law: $F(t,x,y) = \beta^2\, F(\beta^2 t, \beta x, \beta y)$. (c) Is the fundamental solution a
similarity solution?

11.5.10. (a) Find the fundamental solution on $\mathbb{R}^2$ to the cable equation $u_t = \gamma\, \Delta u - \alpha\, u$, where
$\alpha > 0$ is constant. (b) Use your solution to write down a formula for the solution to the
general initial value problem $u(0,x,y) = f(x,y)$ for $(x,y) \in \mathbb{R}^2$.


11.5.11. (a) Prove that if $v(t,x)$ and $w(t,x)$ solve the dispersive wave equation (8.90), then
their product $u(t,x,y) = v(t,x)\, w(t,y)$ solves the two-dimensional dispersive equation

$$u_t + u_{xxx} + u_{yyy} = 0.$$

(b) What is the fundamental solution on $\mathbb{R}^2$ of the latter equation? (c) Write down an
integral formula for the solution to the initial value problem $u(0,x,y) = f(x,y)$ for $(x,y) \in \mathbb{R}^2$.


11.5.12.  Define  the  two-dimensional  convolution  f  *  g  of  functions  f(x,y)  and  g(x,y)  so  that 
equation  (11.135)  is  valid. 


11.6  The  Planar  Wave  Equation 

Let us next consider the two-dimensional wave equation

$$\frac{\partial^2 u}{\partial t^2} = c^2\, \Delta u = c^2 \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right), \tag{11.137}$$

which models the unforced transverse vibrations of a homogeneous membrane, e.g., a drum.
Here, $u(t,x,y)$ represents the vertical displacement of the membrane at time $t$ and position
$(x,y) \in \Omega$, where the domain $\Omega \subset \mathbb{R}^2$, assumed bounded, represents the undeformed shape.
The constant $c^2 > 0$ encapsulates the membrane's physical properties (density, tension,
stiffness, etc.); its square root, $c$, is called, as in the one-dimensional case, the wave speed,
since it represents the speed of propagation of localized signals.

Remark: In this simplified model, we are only allowing small, transverse (vertical)
displacements of the membrane. Large elastic vibrations lead to the nonlinear partial
differential equations of elastodynamics, [7]. In particular, the bending vibrations of a
flexible elastic plate are governed by a more complicated fourth-order partial differential
equation.

The solution $u(t,x,y)$ to the wave equation will be uniquely specified once we impose
suitable boundary and initial conditions. The Dirichlet conditions

$$u(t,x,y) = h(x,y), \qquad (x,y) \in \partial\Omega, \tag{11.138}$$

correspond to gluing our membrane to a fixed boundary, that is, a rim; more generally, we can
also allow $h$ to depend on $t$, modeling a membrane attached to a moving boundary. On
the other hand, the homogeneous Neumann conditions

$$\frac{\partial u}{\partial \mathbf{n}}(t,x,y) = 0, \qquad (x,y) \in \partial\Omega, \tag{11.139}$$

represent a free boundary where the membrane is not attached to any support, although
in this model, its edge is allowed to move only in a vertical direction. Mixed boundary
conditions attach part of the boundary and leave the remaining portion free to vibrate:

$$u = h \quad\text{on}\quad D \subsetneq \partial\Omega, \qquad \frac{\partial u}{\partial \mathbf{n}} = 0 \quad\text{on}\quad N = \partial\Omega \setminus D. \tag{11.140}$$

Since the wave equation is of second order in time, to uniquely specify the solution we need
to impose two initial conditions,

$$u(0,x,y) = f(x,y), \qquad \frac{\partial u}{\partial t}(0,x,y) = g(x,y), \qquad (x,y) \in \Omega. \tag{11.141}$$

The  first  specifies  the  membrane’s  initial  displacement,  while  the  second  prescribes  its 
initial  velocity. 


Separation  of  Variables 

Unfortunately, the d'Alembert solution method does not apply to the two-dimensional
wave equation in any obvious manner. The reason is that, unlike the one-dimensional
version (2.69), one cannot factorize the planar wave operator $\Box = \partial_t^2 - c^2\, \partial_x^2 - c^2\, \partial_y^2$, thus
precluding any sort of reduction to a first-order partial differential equation. However, this
is not the end of the story, and we will return to this issue at the end of Section 12.6.

We thus fall back on our universal solution tool for linear partial differential equations:
separation of variables. According to the general framework established in Section 9.5,
the separable solutions to the wave equation have the trigonometric form

$$u_k(t,x,y) = \cos(\omega_k t)\, v_k(x,y) \quad\text{and}\quad \widehat{u}_k(t,x,y) = \sin(\omega_k t)\, v_k(x,y). \tag{11.142}$$

Substituting back into the wave equation, we find that $v_k(x,y)$ must be an eigenfunction
of the associated Helmholtz boundary value problem

$$c^2 \left( \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} \right) + \lambda\, v = 0, \tag{11.143}$$

whose eigenvalue $\lambda_k = \omega_k^2$ equals the square of the vibrational frequency. According to
Theorem 9.47, on a bounded domain, there is an infinite number of such normal modes with
progressively faster vibrational frequencies: $\omega_k \to \infty$ as $k \to \infty$. In addition, in the positive
semi-definite case, which occurs under homogeneous Neumann boundary conditions,
there is a single constant null eigenfunction, leading to the additional separable solutions

$$u_0(t,x,y) = 1 \quad\text{and}\quad \widehat{u}_0(t,x,y) = t. \tag{11.144}$$

The  first  represents  a  stationary  membrane  that  has  been  displaced  to  a  fixed  height,  while 
the  second  represents  a  membrane  that  is  moving  off  in  the  vertical  direction  with  constant 
unit  speed.  (Think  of  the  membrane  moving  in  outer  space  unaffected  by  any  external 
gravitational  force.)  As  in  Section  9.5,  the  general  solution  can  be  written  as  an  infinite 
series  in  the  eigensolutions  (11.142).  Unfortunately,  as  we  know,  the  Helmholtz  boundary 
value  problem  can  be  explicitly  solved  only  on  a  rather  restricted  class  of  domains.  Here 
we  will  content  ourselves  with  investigating  the  two  most  important  cases:  rectangular  and 
circular  membranes. 
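
The substitution step mentioned above can be written out in a few lines. The following is a sketch in the notation of (11.142), with the mode index $k$ dropped for brevity; the same computation applies to the sine solutions.

```latex
\begin{aligned}
u(t,x,y) &= \cos(\omega t)\, v(x,y), \\
u_{tt} &= -\,\omega^2 \cos(\omega t)\, v,
\qquad
c^2 \Delta u = \cos(\omega t)\; c^2 \Delta v, \\
u_{tt} = c^2 \Delta u
\quad &\Longleftrightarrow \quad
\cos(\omega t)\, \bigl( c^2 \Delta v + \omega^2 v \bigr) = 0
\quad \Longleftrightarrow \quad
c^2 \Delta v + \lambda\, v = 0, \qquad \lambda = \omega^2 .
\end{aligned}
```

When $\lambda = 0$, the time dependence degenerates from trigonometric to linear, $u = (\alpha + \beta\, t)\, v_0$ with $v_0$ constant, which is where the two extra solutions $1$ and $t$ come from.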

Remark: The vibrational frequencies represent the tones and overtones one hears when
the  drum  membrane  vibrates.  An  interesting  question  is  whether  two  drums  of  different 
shapes  can  have  identical  sounds  —  the  exact  same  vibrational  frequencies.  Or,  more 
descriptively,  can  one  “hear”  the  shape  of  a  drum?  It  was  not  until  1992  that  the  answer 
was  shown  to  be  no,  but  for  quite  subtle  reasons.  See  [47]  for  a  discussion  and  some 
examples  of  differently  shaped  drums  that  have  the  same  vibrational  frequencies. 


Vibration of a Rectangular Drum


Let  us  first  consider  the  vibrations  of  a  membrane  in  the  shape  of  a  rectangle 

$$R = \{\, 0 < x < a, \ 0 < y < b \,\},$$

with side lengths $a$ and $b$, whose edges are fixed to the $(x,y)$-plane. Thus, we seek to solve the wave equation

$$u_{tt} = c^2 \Delta u = c^2 (u_{xx} + u_{yy}), \qquad 0 < x < a, \quad 0 < y < b, \tag{11.145}$$


subject  to  the  initial  and  boundary  conditions 

$$\begin{aligned} u(t,0,y) = u(t,a,y) &= 0 = u(t,x,0) = u(t,x,b), \\ u(0,x,y) = f(x,y), \qquad u_t(0,x,y) &= g(x,y), \end{aligned} \qquad 0 < x < a, \quad 0 < y < b. \tag{11.146}$$


As  we  saw  in  Section  11.2,  the  eigenfunctions  and  eigenvalues  for  the  associated  Helmholtz 
equation  on  a  rectangle, 


$$c^2 (v_{xx} + v_{yy}) + \lambda\, v = 0, \qquad (x,y) \in R, \tag{11.147}$$

when subject to the homogeneous Dirichlet boundary conditions

$$v(0,y) = v(a,y) = 0 = v(x,0) = v(x,b), \qquad 0 < x < a, \quad 0 < y < b, \tag{11.148}$$

are


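$$v_{m,n}(x,y) = \sin\frac{m\pi x}{a}\, \sin\frac{n\pi y}{b}, \qquad\text{where}\qquad \lambda_{m,n} = \pi^2 c^2 \left( \frac{m^2}{a^2} + \frac{n^2}{b^2} \right), \tag{11.149}$$

with $m, n = 1, 2, \ldots$. The fundamental frequencies of vibration are the square roots of the eigenvalues, so

$$\omega_{m,n} = \sqrt{\lambda_{m,n}} = \pi\, c\, \sqrt{\frac{m^2}{a^2} + \frac{n^2}{b^2}}, \qquad m, n = 1, 2, \ldots. \tag{11.150}$$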


The  frequencies  will  depend  upon  the  underlying  geometry  —  meaning  the  side  lengths  — 
of  the  rectangle,  as  well  as  the  wave  speed  c,  which,  in  turn,  is  a  function  of  the  membrane’s 
density  and  stiffness.  The  higher  the  wave  speed,  or  the  smaller  the  rectangle,  the  faster 
the  vibrations.  In  layman’s  terms,  (11.150)  quantifies  the  observation  that  smaller,  stiffer 
drums  made  of  less-dense  material  vibrate  faster. 
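
As a quick numerical illustration of (11.150), the following minimal sketch lists the lowest frequencies of a rectangle with fixed edges. The function name `rectangle_frequencies` and the cutoff `max_index` are illustrative assumptions, not part of the text; the cutoff only needs to be large enough that the lowest few modes are all included.

```python
import math

def rectangle_frequencies(a, b, c=1.0, count=6, max_index=20):
    """Return the `count` lowest (omega, m, n) triples for an a-by-b
    rectangle with fixed edges, using
    omega = pi * c * sqrt(m^2/a^2 + n^2/b^2)   (formula (11.150))."""
    modes = [
        (math.pi * c * math.sqrt((m / a) ** 2 + (n / b) ** 2), m, n)
        for m in range(1, max_index + 1)   # hypothetical cutoff on m
        for n in range(1, max_index + 1)   # hypothetical cutoff on n
    ]
    modes.sort()                           # ascending in frequency
    return modes[:count]

if __name__ == "__main__":
    # A 2-by-1 rectangle with unit wave speed.
    for omega, m, n in rectangle_frequencies(2.0, 1.0):
        print(f"m={m}  n={n}  omega={omega:.4f}")
```

Halving both side lengths, or doubling the wave speed, doubles every frequency in the list, which is the quantitative content of the remark about smaller, stiffer drums.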

According  to  (11.142),  the  normal  modes  of  vibration  of  our  rectangle  are 


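$$\begin{aligned} u_{m,n}(t,x,y) &= \cos\left( \pi\, c\, \sqrt{\frac{m^2}{a^2} + \frac{n^2}{b^2}}\; t \right) \sin\frac{m\pi x}{a}\, \sin\frac{n\pi y}{b}, \\ \tilde u_{m,n}(t,x,y) &= \sin\left( \pi\, c\, \sqrt{\frac{m^2}{a^2} + \frac{n^2}{b^2}}\; t \right) \sin\frac{m\pi x}{a}\, \sin\frac{n\pi y}{b}. \end{aligned} \tag{11.151}$$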


The  general  solution  can  then  be  written  as  a  double  Fourier  series 


$$u(t,x,y) = \sum_{m,n=1}^{\infty} \bigl[\, a_{m,n}\, u_{m,n}(t,x,y) + b_{m,n}\, \tilde u_{m,n}(t,x,y) \,\bigr]$$

in the normal modes. The coefficients $a_{m,n}, b_{m,n}$ are fixed by the initial displacement $u(0,x,y) = f(x,y)$ and the initial velocity $u_t(0,x,y) = g(x,y)$. Indeed, the usual orthogonality relations among the eigenfunctions imply


$$\begin{aligned} a_{m,n} &= \frac{\langle f, v_{m,n} \rangle}{\| v_{m,n} \|^2} = \frac{4}{a\, b} \int_0^b \!\! \int_0^a f(x,y)\, \sin\frac{m\pi x}{a}\, \sin\frac{n\pi y}{b}\; dx\, dy, \\ b_{m,n} &= \frac{\langle g, v_{m,n} \rangle}{\omega_{m,n}\, \| v_{m,n} \|^2} = \frac{4}{\pi\, c\, \sqrt{m^2 b^2 + n^2 a^2}} \int_0^b \!\! \int_0^a g(x,y)\, \sin\frac{m\pi x}{a}\, \sin\frac{n\pi y}{b}\; dx\, dy. \end{aligned} \tag{11.152}$$


Since  the  fundamental  frequencies  are  not  rational  multiples  of  each  other,  the  general 
solution  is  a  genuinely  quasiperiodic  superposition  of  the  various  normal  modes. 

In Figure 11.10, we plot the solution resulting from the initially concentrated displacement†

$$u(0,x,y) = f(x,y) = e^{-100\,[\,(x - .5)^2 + (y - .5)^2\,]}$$

at the center of a unit square, so $a = b = 1$, with unit wave speed, $c = 1$. Note that, unlike a concentrated displacement of a one-dimensional string, which remains concentrated at all subsequent times and periodically repeats, the initial displacement here spreads out in a radially symmetric manner and propagates to the edges of the rectangle, where it reflects and then interacts with itself. Moreover, due to the quasiperiodicity of the solution, the drum's motion never exactly repeats, and the initially concentrated displacement never quite reforms.

† The alert reader may object that the initial displacement $f(x,y)$ does not exactly satisfy the Dirichlet boundary conditions on the edges of the rectangle. But this does not prevent the existence of a well-defined (weak) solution to the initial value problem, whose initial boundary discontinuities will subsequently propagate into the square. However, here these are so tiny as to be unnoticeable in the solution graphs.
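
The series solution for a concentrated initial displacement can be sketched numerically as follows. This is only an illustration under stated assumptions: the initial velocity is taken to be zero (so all $b_{m,n}$ vanish), the series is truncated at 30 modes in each direction, and the coefficient integrals in (11.152) are approximated by a simple quadrature on a uniform grid. The name `sine_series_solution` and its parameters are hypothetical.

```python
import math

def sine_series_solution(f, a=1.0, b=1.0, c=1.0, modes=30, grid=100):
    """Return a function u(t, x, y) approximating the solution with
    u(0,x,y) = f(x,y) and zero initial velocity (an assumption),
    built from the truncated double sine series."""
    dx, dy = a / grid, b / grid
    xs = [i * dx for i in range(grid + 1)]
    ys = [j * dy for j in range(grid + 1)]
    # Tabulate the one-dimensional sine factors of the eigenfunctions.
    sx = [[math.sin(m * math.pi * x / a) for x in xs] for m in range(1, modes + 1)]
    sy = [[math.sin(n * math.pi * y / b) for y in ys] for n in range(1, modes + 1)]
    F = [[f(x, y) for y in ys] for x in xs]
    # Integrate in x first. The sines vanish at both endpoints, so this
    # plain sum coincides with the trapezoid rule.
    T = [[sum(sx[m][i] * F[i][j] for i in range(grid + 1)) * dx
          for j in range(grid + 1)] for m in range(modes)]
    # Then integrate in y; the factor 4/(a b) is 1 / ||v_mn||^2.
    A = [[4.0 / (a * b) * sum(T[m][j] * sy[n][j] for j in range(grid + 1)) * dy
          for n in range(modes)] for m in range(modes)]
    # Vibrational frequencies omega_mn from (11.150).
    W = [[math.pi * c * math.hypot((m + 1) / a, (n + 1) / b)
          for n in range(modes)] for m in range(modes)]

    def u(t, x, y):
        total = 0.0
        for m in range(modes):
            vx = math.sin((m + 1) * math.pi * x / a)
            for n in range(modes):
                total += (A[m][n] * math.cos(W[m][n] * t)
                          * vx * math.sin((n + 1) * math.pi * y / b))
        return total

    return u

if __name__ == "__main__":
    def bump(x, y):
        return math.exp(-100.0 * ((x - 0.5) ** 2 + (y - 0.5) ** 2))

    u = sine_series_solution(bump)
    for t in (0.0, 0.25, 0.5):
        print(t, u(t, 0.5, 0.5))   # displacement at the center
```

Because the Gaussian is so narrow, its sine coefficients decay very quickly, which is why a modest truncation suffices here; rougher initial data would need more modes and a finer grid.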


Vibration  of  a  Circular  Drum 


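Let us next analyze the vibrations of a circular membrane of unit radius. In polar coordinates, the planar wave equation (11.137) takes the form

$$\frac{\partial^2 u}{\partial t^2} = c^2 \left( \frac{\partial^2 u}{\partial r^2} + \frac{1}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \theta^2} \right). \tag{11.153}$$

We will again consider the homogeneous Dirichlet boundary value problem

$$u(t,1,\theta) = 0, \qquad t > 0, \quad -\pi < \theta < \pi, \tag{11.154}$$

along with initial conditions

$$u(0,r,\theta) = f(r,\theta), \qquad \frac{\partial u}{\partial t}(0,r,\theta) = g(r,\theta), \tag{11.155}$$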

representing  the  initial  displacement  and  velocity  of  the  membrane.  As  always,  we  build  up 
the  general  solution  as  a  quasiperiodic  linear  combination  of  the  normal  modes  as  specified 
by  the  eigenfunctions  for  the  associated  Helmholtz  boundary  value  problem. 

As  we  saw  in  Section  11.2,  the  eigenfunctions  of  the  Helmholtz  equation  on  a  disk 
of  radius  1,  say,  subject  to  homogeneous  Dirichlet  boundary  conditions,  are  products  of 
trigonometric  and  Bessel  functions: 

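$$\begin{aligned} v_{0,n}(r,\theta) &= J_0(\zeta_{0,n}\, r), \\ v_{m,n}(r,\theta) &= J_m(\zeta_{m,n}\, r) \cos m\theta, \\ \tilde v_{m,n}(r,\theta) &= J_m(\zeta_{m,n}\, r) \sin m\theta, \end{aligned} \qquad m, n = 1, 2, 3, \ldots. \tag{11.156}$$

Here $r, \theta$ are the usual polar coordinates, while $\zeta_{m,n} > 0$ denotes the $n$th (positive) root of the $m$th order Bessel function $J_m(z)$, cf. (11.118). The corresponding eigenvalue is its square, $\lambda_{m,n} = \zeta_{m,n}^2$, and hence the natural frequencies of vibration are equal to the Bessel roots scaled by the wave speed:

$$\omega_{m,n} = c \sqrt{\lambda_{m,n}} = c\, \zeta_{m,n}. \tag{11.157}$$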

A table of their values (for the case $c = 1$) can be found in the preceding section. The Bessel roots do not follow any easily discernible pattern, and are not rational multiples of each other. This result, known as Bourget's hypothesis, [119; p. 484], was rigorously proved by the German mathematician Carl Ludwig Siegel in 1929, [106]. Thus, the vibrations of a circular drum are also truly quasiperiodic, thereby providing a mathematical explanation of why drums sound dissonant.

Figure 11.11.

The frequencies $\omega_{0,n} = c\, \zeta_{0,n}$ correspond to simple eigenvalues, with a single radially symmetric eigenfunction $J_0(\zeta_{0,n}\, r)$, while the "angular modes" $\omega_{m,n}$, for $m > 0$, are double, each possessing two linearly independent eigenfunctions (11.156). According to the general formula (11.142), each eigenfunction engenders two independent normal modes of vibration, having the explicit forms

$$\begin{aligned} &\cos(c\, \zeta_{0,n}\, t)\, J_0(\zeta_{0,n}\, r), & &\sin(c\, \zeta_{0,n}\, t)\, J_0(\zeta_{0,n}\, r), \\ &\cos(c\, \zeta_{m,n}\, t)\, J_m(\zeta_{m,n}\, r) \cos m\theta, & &\sin(c\, \zeta_{m,n}\, t)\, J_m(\zeta_{m,n}\, r) \cos m\theta, \\ &\cos(c\, \zeta_{m,n}\, t)\, J_m(\zeta_{m,n}\, r) \sin m\theta, & &\sin(c\, \zeta_{m,n}\, t)\, J_m(\zeta_{m,n}\, r) \sin m\theta. \end{aligned} \tag{11.158}$$

The general solution to (11.153-154) is then expressed as a Fourier-Bessel series:


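$$\begin{aligned} u(t,r,\theta) = {} & \frac{1}{2} \sum_{n=1}^{\infty} \bigl[\, a_{0,n} \cos(c\, \zeta_{0,n}\, t) + c_{0,n} \sin(c\, \zeta_{0,n}\, t) \,\bigr]\, J_0(\zeta_{0,n}\, r) \\ & + \sum_{m,n=1}^{\infty} \Bigl[\, \bigl( a_{m,n} \cos(c\, \zeta_{m,n}\, t) + c_{m,n} \sin(c\, \zeta_{m,n}\, t) \bigr) \cos m\theta \\ & \qquad\qquad + \bigl( b_{m,n} \cos(c\, \zeta_{m,n}\, t) + d_{m,n} \sin(c\, \zeta_{m,n}\, t) \bigr) \sin m\theta \,\Bigr]\, J_m(\zeta_{m,n}\, r), \end{aligned} \tag{11.159}$$

whose coefficients $a_{m,n}, b_{m,n}, c_{m,n}, d_{m,n}$ are determined, as usual, by the initial displacement and velocity of the membrane (11.155). In Figure 11.11, the vibrations due to an initially off-center concentrated displacement are displayed; the wave speed is $c = 1$, and the time interval between successive plots is $\Delta t = .3$. Again, the motion is only quasiperiodic and, no matter how long you wait, never quite returns to its original configuration.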


Exercises 

11.6.1.  Use  your  physical  intuition  to  decide  whether  the  following  statements  are  true  or  false. 
Then  justify  your  answer. 

(a)  Increasing  the  stiffness  of  a  membrane  increases  the  wave  speed. 

(b)  Increasing  the  density  of  a  membrane  increases  the  wave  speed. 

(c) Increasing the size of a membrane increases the wave speed.

11.6.2.  Two  uniform  membranes  have  the  same  shape,  but  are  made  out  of  different  materials. 
Assuming  that  they  are  both  subject  to  the  same  homogeneous  boundary  conditions,  how 
are  their  vibrational  frequencies  related? 

11.6.3. List the numerical values of the six lowest vibrational frequencies of a unit square with wave speed $c = 1$ when subject to homogeneous Dirichlet boundary conditions. How many linearly independent normal modes are associated with each of these frequencies?

T 11.6.4. The rectangular membrane $R = \{\, -1 < x < 1, \ 0 < y < 1 \,\}$ has its two short sides attached to the $(x,y)$-plane, while its long sides are left free. The membrane is initially displaced so that its right half is one unit above, while its left half is one unit below the plane, and then released with zero initial velocity. (This discontinuous initial data serves to model a very sharp transition region.) Assume that the physical units are chosen so the wave speed $c = 1$. (a) Write down an initial-boundary value problem that governs the vibrations of the membrane. (b) What are the fundamental frequencies of vibration of the membrane? (c) Find the eigenfunction series solution that describes the subsequent motion of the membrane. (d) Is the motion (i) periodic? (ii) quasiperiodic? (iii) unstable? (iv) chaotic? Explain your answer.

11.6.5. Determine the solution to the following initial-boundary value problems for the wave equation on the rectangle $R = \{\, 0 < x < 2, \ 0 < y < 1 \,\}$:

(a) $u_{tt} = u_{xx} + u_{yy}$, $u(t,x,0) = u(t,x,1) = u(t,0,y) = u(t,2,y) = 0$, $u(0,x,y) = \sin \pi y$, $u_t(0,x,y) = \sin \pi y$;
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(b) 


utt  =  uxx 


Utt  uxx 


(c) 


uyy ’ 

u(t,  x,  0)  = 

=  u(t ,  X. 

i)  = 

sin  7 vy, 

ut(0,x,y) 

=  sin7ry; 

Uyy, 

u(t,  X, 

0)  = 

f 

0 

< 

X  <  1, 

1 

1 

< 

x  <  2, 

1 1 

H 

cT 

HO 

3 

du 

dx 


(t,o,y)  = 


(d) 


—i—  2?/ 

“xx  1  yy 


u{0,x,y)  = 
utt  =  2u, 

u(0,x,y)=0,  ut(0  ,x,y) 

11.6.6.  True  or  false:  The  more  sides  of  a  rectangle  that  are  tied  down,  the  faster  it  vibrates. 


u(t,  x ,  0)  =  u(t,  x ,  1)  =  u(t,  0,  y)  =  u(t,  2,  y)  =  0, 

1,  0  <  x  <  1, 

0,  1  <  x  <  2. 


11.6.7.  Answer  Exercise  11.6.3  when  (a)  two  adjacent  sides  of  the  square  are  tied  down  and 
the  other  two  are  left  free;  (b)  two  opposite  sides  of  the  square  are  tied  down  and  the  other 
two  are  left  free;  (c)  the  membrane  is  freely  floating  in  outer  space. 

11.6.8.  A  square  drum  has  two  sides  fixed  to  a  support  and  two  sides  left  free.  Does  the  drum 
vibrate  faster  if  the  fixed  and  free  sides  are  adjacent  to  each  other  or  on  opposite  sides? 


11.6.9.  Write  down  a  periodic  solution  to  the  wave  equation  on  a  unit  square,  subject  to 
homogeneous  Dirichlet  boundary  conditions,  that  is  not  a  normal  mode.  Does  it  vibrate  at 
a  fundamental  frequency? 

11.6.10. A rectangular drum with side lengths 1 cm by 2 cm and unit wave speed $c = 1$ has its boundary fixed to the $(x,y)$-plane while subject to a periodic external forcing of the form $F(t,x,y) = \cos(\omega t)\, h(x,y)$. (a) At which frequencies $\omega$ will the forcing incite resonance in the drum? (b) If $\omega$ is a resonant frequency, write down the condition(s) on $h(x,y)$ that ensure excitation of a resonant mode.


11.6.11.  The  right  half  of  a  rectangle  of  side  lengths  1  by  2  is  initially  displaced,  while  the  left 
half  is  quiescent.  True  or  false:  The  ensuing  vibrations  are  restricted  to  the  right  half  of 
the  membrane. 

T 11.6.12. A torus (inner tube) can be obtained by gluing together each of the two pairs of opposite sides of a rubber rectangle. The (small) vibrations of the torus are described by the following periodic initial-boundary value problem for the wave equation, in which $x, y$ represent angular variables:

$$\begin{aligned} & u_{tt} = c^2 \Delta u = c^2 (u_{xx} + u_{yy}), \qquad u(0,x,y) = f(x,y), \quad u_t(0,x,y) = g(x,y), \\ & u(t,-\pi,y) = u(t,\pi,y), \qquad u_x(t,-\pi,y) = u_x(t,\pi,y), \\ & u(t,x,-\pi) = u(t,x,\pi), \qquad u_y(t,x,-\pi) = u_y(t,x,\pi), \end{aligned} \qquad -\pi < x < \pi, \quad -\pi < y < \pi.$$

(a) Find the fundamental frequencies and normal modes of vibration. (b) Write down a series for the solution. (c) Discuss the stability of a vibrating torus. Is the motion (i) periodic; (ii) quasiperiodic; (iii) chaotic; (iv) none of these?

11.6.13. The forced wave equation $u_{tt} = c^2 \Delta u + F(x,y)$ on a bounded domain $\Omega \subset \mathbb{R}^2$ models a membrane subject to a constant external forcing function $F(x,y)$. Write down an eigenfunction series solution to the forced wave equation when the membrane is subject to homogeneous Dirichlet boundary conditions and initial conditions $u(0,x,y) = f(x,y)$, $u_t(0,x,y) = g(x,y)$. Hint: Expand the forcing function in an eigenfunction series.

11.6.14. A circular drum of radius $\zeta_{0,1} \approx 2.4048$ has initial displacement and velocity

$$u(0,x,y) = 0, \qquad \frac{\partial u}{\partial t}(0,x,y) = 2\, J_0\bigl( \sqrt{x^2 + y^2}\, \bigr).$$

Assuming  that  the  circular  edge  of  the  drum  is  fixed  to  the  (x,  y)-plane,  describe,  both 
qualitatively  and  quantitatively,  its  subsequent  motion. 

11.6.15.  Write  out  the  integral  formulae  for  the  coefficients  in  the  Fourier-Bessel  series  solution 
(11.159)  to  the  wave  equation  in  a  circular  disk  in  terms  of  the  initial  data 

$$u(0,r,\theta) = f(r,\theta), \qquad u_t(0,r,\theta) = g(r,\theta).$$

11.6.16. A circular drum at rest is struck with a concentrated blow at its center. Write down an eigenfunction series describing the resulting vibration.

T*  11.6.17.  (a)  Set  up  and  solve  the  initial-boundary  value  problem  for  the  vibrations  of  a  uniform 
circular  drum  of  unit  radius  that  is  freely  floating  in  space,  (b)  Discuss  the  stability  of  the 
drum’s  motion,  (c)  Are  the  vibrations  slower  or  faster  than  when  its  edges  are  fixed  to  a 
plane? 

11.6.18. A flat quarter-disk of radius 1 has its circular edge and one of its straight edges attached to the $(x,y)$-plane, while the other straight edge is left free. At time $t = 0$ the disk is struck with a hammer (unit delta function) at its midpoint, i.e., at radius $\frac12$ and halfway between the straight edges. (a) Write down an initial-boundary value problem for the subsequent vibrations of the quarter-disk. Hint: Be careful with the form of the delta function in polar coordinates; see Exercise 6.3.6. (b) Assuming that the physical units are chosen so that the wave speed $c = 1$, determine the quarter-disk's vibrational frequencies. (c) Write down an eigenfunction series solution for the subsequent motion. (d) Is the motion unstable? periodic? If so, what is the period?

11.6.19. True or false: Assuming homogeneous Dirichlet boundary conditions, the fundamental frequencies of a vibrating half-disk are exactly twice those of the full disk of the same radius.

T 11.6.20. The edge of a circular drum is moved periodically up and down, so $u(t,1,\theta) = \cos \omega t$. Assuming that the drum is initially at rest, discuss its response.

11.6.21. A drum is in the shape of a circular annulus with outer radius 1 meter and inner radius .5 meter. Find numerical values for its first three fundamental vibrational frequencies.


T 11.6.22. A homogeneous rope of length 1 and weight 1 is suspended from the ceiling. Taking $x$ as the vertical coordinate, with $x = 1$ representing the fixed end and $x = 0$ the free end, the planar displacement $u(t,x)$ of the rope satisfies the initial-boundary value problem

$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial}{\partial x} \left( x\, \frac{\partial u}{\partial x} \right), \qquad \begin{aligned} & |u(t,0)| < \infty, \quad u(t,1) = 0, \\ & u(0,x) = f(x), \quad \frac{\partial u}{\partial t}(0,x) = g(x), \end{aligned} \qquad t > 0, \quad 0 < x < 1.$$

(a) Find the solution. Hint: Let $y = \sqrt{x}$. (b) Are the vibrations periodic or quasiperiodic? (c) Describe the behavior of the rope when subject to uniform periodic external forcing $F(t,x) = a \cos \omega t$.


Scaling  and  Symmetry 


Symmetry methods can also be effectively employed in the analysis of the wave equation. Let us consider the simultaneous rescaling

$$t \longmapsto \alpha\, t, \qquad x \longmapsto \beta\, x, \qquad y \longmapsto \beta\, y, \tag{11.160}$$

of time and space, whose effect is to change the function $u(t,x,y)$ into a rescaled version

$$U(t,x,y) = u(\alpha\, t, \beta\, x, \beta\, y). \tag{11.161}$$

The chain rule is employed to relate their derivatives:

$$\frac{\partial^2 U}{\partial t^2} = \alpha^2\, \frac{\partial^2 u}{\partial t^2}, \qquad \frac{\partial^2 U}{\partial x^2} = \beta^2\, \frac{\partial^2 u}{\partial x^2}, \qquad \frac{\partial^2 U}{\partial y^2} = \beta^2\, \frac{\partial^2 u}{\partial y^2}.$$

Therefore, if $u$ satisfies the wave equation $u_{tt} = c^2 \Delta u$, then $U$ satisfies the rescaled wave equation

$$U_{tt} = C^2 \Delta U, \qquad\text{where the rescaled wave speed is}\qquad C = \frac{\alpha\, c}{\beta}. \tag{11.162}$$


In particular, rescaling only time by setting $\alpha = 1/c$, $\beta = 1$, results in a unit wave speed $C = 1$. In other words, we are free to choose our unit of time measurement so as to fix the wave speed equal to 1.

If we set $\alpha = \beta$, scaling time and space in the same proportion, then the wave speed does not change, $C = c$, and so

$$t \longmapsto \beta\, t, \qquad x \longmapsto \beta\, x, \qquad y \longmapsto \beta\, y, \tag{11.163}$$

defines a symmetry transformation for the wave equation: If $u(t,x,y)$ is any solution to the wave equation, then so is its rescaled version

$$U(t,x,y) = u(\beta\, t, \beta\, x, \beta\, y) \tag{11.164}$$

for any choice of scale parameter $\beta \ne 0$. Observe that if $u(t,x,y)$ is defined on a domain $\Omega$, then the rescaled solution $U(t,x,y)$ will be defined on the rescaled domain

$$\widetilde{\Omega} = \frac{1}{\beta}\, \Omega = \left\{ \left( \frac{x}{\beta}, \frac{y}{\beta} \right) \;\Big|\; (x,y) \in \Omega \right\} = \bigl\{\, (x,y) \;\big|\; (\beta x, \beta y) \in \Omega \,\bigr\}. \tag{11.165}$$


For instance, setting the scaling parameter $\beta = 2$ halves the size of the domain. The normal modes for the rescaled domain have the form

$$\begin{aligned} U_n(t,x,y) &= u_n(\beta t, \beta x, \beta y) = \cos(\beta\, \omega_n t)\, v_n(\beta x, \beta y), \\ \widetilde U_n(t,x,y) &= \tilde u_n(\beta t, \beta x, \beta y) = \sin(\beta\, \omega_n t)\, v_n(\beta x, \beta y), \end{aligned}$$

and hence the rescaled vibrational frequencies are $\Omega_n = \beta\, \omega_n$. Thus, when $\beta < 1$, the rescaled membrane is larger by a factor $1/\beta$, and its vibrations are slowed down by the reciprocal factor $\beta$. For instance, a drum that is twice as large will vibrate twice as slowly, and hence have an octave lower overall tone. Musically, this means that all drums of a similar shape have the same pattern of overtones, differing only in their overall pitch, which is a function of their size, tautness, and density.
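
The scaling symmetry can be checked numerically on a rectangular mode. The sketch below rescales a cosine mode of an $a \times b$ rectangle by $\beta$, compares its frequency with formula (11.150) applied to the enlarged rectangle, and estimates the residual $U_{tt} - c^2 \Delta U$ by central differences. The helper names `omega`, `rescaled_mode`, and `wave_residual`, and the step size `h`, are illustrative assumptions.

```python
import math

def omega(m, n, a, b, c=1.0):
    """Frequency (11.150) of the (m, n) mode on an a-by-b rectangle."""
    return math.pi * c * math.hypot(m / a, n / b)

def rescaled_mode(beta, m, n, a=1.0, b=1.0, c=1.0):
    """Return U(t,x,y) = u(beta t, beta x, beta y) for the cosine mode
    u_{m,n} of an a-by-b rectangle, together with its frequency."""
    w = omega(m, n, a, b, c)

    def U(t, x, y):
        return (math.cos(w * beta * t)
                * math.sin(m * math.pi * beta * x / a)
                * math.sin(n * math.pi * beta * y / b))

    return U, beta * w

def wave_residual(U, t, x, y, c=1.0, h=1e-3):
    """Central-difference estimate of U_tt - c^2 (U_xx + U_yy)."""
    u0 = U(t, x, y)
    u_tt = (U(t + h, x, y) - 2.0 * u0 + U(t - h, x, y)) / h**2
    u_xx = (U(t, x + h, y) - 2.0 * u0 + U(t, x - h, y)) / h**2
    u_yy = (U(t, x, y + h) - 2.0 * u0 + U(t, x, y - h)) / h**2
    return u_tt - c**2 * (u_xx + u_yy)

if __name__ == "__main__":
    beta = 0.5                       # rescaled membrane is twice as large
    U, Omega = rescaled_mode(beta, 1, 2)
    print(Omega, omega(1, 2, 2.0, 2.0))     # the two frequencies agree
    print(wave_residual(U, 0.3, 0.7, 1.1))  # small: U still solves the equation
```

The residual is not exactly zero only because of the finite-difference approximation; shrinking `h` moderately reduces it further.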

In particular, choosing $\beta = 1/R$ will rescale the unit disk into a disk of radius $R$. The fundamental frequencies of the rescaled disk are

$$\omega_{m,n} = \frac{c}{R}\, \zeta_{m,n}, \tag{11.166}$$

where $c$ is the wave speed and $\zeta_{m,n}$ are the Bessel roots, defined in (11.118). Observe that the ratios $\omega_{m,n} / \omega_{m',n'}$ between vibrational frequencies remain the same, independent of the size of the disk $R$ and the wave speed $c$. We define the relative vibrational frequencies

$$\rho_{m,n} = \frac{\omega_{m,n}}{\omega_{0,1}} = \frac{\zeta_{m,n}}{\zeta_{0,1}}, \qquad\text{in proportion to}\qquad \omega_{0,1} = \frac{c\, \zeta_{0,1}}{R}, \tag{11.167}$$

which is the drum's dominant, or lowest, vibrational frequency. The relative frequencies $\rho_{m,n}$ are independent of the size, stiffness or composition of the drum membrane. In the following table, we display a list of all relative vibrational frequencies (11.167) that are $< 6$. Once the lowest frequency $\omega_{0,1}$ has been determined, whether theoretically, numerically, or experimentally, all the higher overtones $\omega_{m,n} = \rho_{m,n}\, \omega_{0,1}$ are simply obtained by rescaling.
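
The relative frequencies (11.167) can be computed with nothing beyond the standard library. The sketch below is one possible approach, not the method used to prepare the book's table: it evaluates $J_m$ through the classical integral representation $J_m(x) = \frac{1}{\pi} \int_0^\pi \cos(m\tau - x \sin\tau)\, d\tau$ with the trapezoid rule, locates roots by a sign-change scan followed by bisection, and relies on the standard fact that $J_m$ has no positive root below $m$. All function names, the scan step, and the panel count are assumptions chosen for illustration.

```python
import math

def bessel_j(m, x, panels=200):
    """J_m(x) for integer m >= 0 via the integral representation
    (1/pi) * int_0^pi cos(m*tau - x*sin(tau)) d tau, trapezoid rule."""
    h = math.pi / panels
    total = 0.5 * (1.0 + math.cos(m * math.pi))   # endpoints tau = 0, pi
    for k in range(1, panels):
        tau = k * h
        total += math.cos(m * tau - x * math.sin(tau))
    return total * h / math.pi

def bessel_roots(m, upper, step=0.05):
    """Positive roots of J_m below `upper`: scan for sign changes, then bisect."""
    roots = []
    lo = max(float(m), 0.5)        # assumption used: first root exceeds m
    f_lo = bessel_j(m, lo)
    while lo < upper:
        hi = lo + step
        f_hi = bessel_j(m, hi)
        if f_lo * f_hi < 0.0:
            a, b, fa = lo, hi, f_lo
            for _ in range(50):    # bisection to near machine precision
                mid = 0.5 * (a + b)
                fm = bessel_j(m, mid)
                if fa * fm <= 0.0:
                    b = mid
                else:
                    a, fa = mid, fm
            root = 0.5 * (a + b)
            if root < upper:
                roots.append(root)
        lo, f_lo = hi, f_hi
    return roots

def relative_frequencies(bound=6.0):
    """Sorted list of (rho, m, n) with rho = zeta_{m,n} / zeta_{0,1} < bound."""
    z01 = bessel_roots(0, 3.0)[0]
    upper = bound * z01
    table = []
    m = 0
    while m < upper:               # no roots of J_m below m, so this terminates
        for n, z in enumerate(bessel_roots(m, upper), start=1):
            table.append((z / z01, m, n))
        m += 1
    return sorted(table)

if __name__ == "__main__":
    for rho, m, n in relative_frequencies():
        print(f"m={m:2d}  n={n}  rho={rho:.3f}")
```

Multiplying each listed ratio by a measured lowest frequency $\omega_{0,1}$ then gives the corresponding overtone, as described above.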


Relative  Vibrational  Frequencies  of  a  Circular  Disk 


Exercises 

11.6.23. True or false: Two rectangular membranes, made out of the same material and both subject to Dirichlet boundary conditions, have the same relative vibrational frequencies if and only if they have similar shapes.

11.6.24.  True  or  false:  (a)  The  vibrational  frequencies  of  a  square  with  side  lengths  a  =  b  =  2 
are  four  times  as  slow  as  those  of  a  square  with  side  lengths  a  =  b  =  1. 

(b)  The  vibrational  frequencies  of  a  rectangle  with  side  lengths  a  =  2,  b  =  1,  are  twice  as 
slow  as  those  of  a  square  with  side  lengths  a  =  b  =  1. 

11.6.25. A vibrating rectangle of unknown size has wave speed $c = 1$ and is subject to homogeneous Dirichlet boundary conditions. How many of its lowest vibrational frequencies do you need to know in order to determine the size of the rectangle?

11.6.26.  Answer  Exercise  11.6.25  when  the  rectangle  is  subject  to  homogeneous  Neumann 
boundary  conditions. 

£  11.6.27.  A  circular  drum  has  the  A  above  middle  C,  which  has  a  frequency  of  440  Hertz,  as  its 
lowest  tone.  What  notes  are  the  first  five  overtones  nearest?  Try  playing  these  on  a  piano 
or  guitar.  Or,  if  you  have  a  synthesizer,  try  assembling  notes  of  these  frequencies  to  see  how 
closely  it  reproduces  the  dissonant  sound  of  a  drum. 

11.6.28. In an orchestra, a circular snare drum of radius 1 foot sits near a second circular drum made out of the same material. Vibrations of the first drum are observed to excite an undesired resonant vibration in its partner. What are the possible radii of the second drum?

11.6.29.  True  or  false:  The  relative  vibrational  frequencies  of  a  half-disk,  subject  to  Dirichlet 
boundary  conditions,  are  a  subset  of  the  relative  vibrational  frequencies  of  a  full  disk. 

11.6.30. True or false: If $u(t,x,y) = \cos(\omega t)\, v(x,y)$ is a normal mode of vibration for a unit square subject to homogeneous Dirichlet boundary conditions, then the function $\tilde u(t,x,y) = \cos(\omega t)\, v\bigl( \tfrac{x}{2}, \tfrac{y}{3} \bigr)$ is a normal mode of vibration for a $2 \times 3$ rectangle that is subject to the same boundary conditions, but with a possibly different wave speed. If true, how are the wave speeds of the two rectangles related?

11.6.31. Prove that if $u(t,x,y)$ is a solution to the two-dimensional wave equation, so is the translated function $U(t,x,y) = u(t - t_0, x - x_0, y - y_0)$, for any constants $t_0, x_0, y_0$.

0 11.6.32. (a) Prove that if $u(t,x,y)$ solves the wave equation, so does $U(t,x,y) = u(-t,x,y)$.
Thus,  unlike  the  heat  equation,  the  wave  equation  is  time-reversible,  and  its  solutions  can 
be  unambiguously  followed  backwards  in  time,  (b)  Suppose  u(t,x,y)  solves  the  initial  value 
problem  (11.141).  Write  down  the  initial  value  problem  satisfied  by  U(t,x,y). 

11.6.33. (a) Prove that, on $\mathbb{R}^2$, the solution to the pure displacement initial value problem $u_{tt} = c^2 \Delta u$, $u(0,x,y) = f(x,y)$, $u_t(0,x,y) = 0$, is an even function of $t$.

(b) Prove that the solution to the pure velocity initial value problem $u_{tt} = c^2 \Delta u$, $u(0,x,y) = 0$, $u_t(0,x,y) = g(x,y)$, is an odd function of $t$.

Hint: Use Exercise 11.6.32 and uniqueness of solutions to the initial value problem.

11.6.34. Suppose $v(t,x)$ is any solution to the one-dimensional wave equation $v_{tt} = v_{xx}$. Prove that $u(t,x,y) = v(t, a x + b y)$, for any constants $(a,b) \ne (0,0)$, solves the two-dimensional wave equation $u_{tt} = c^2 (u_{xx} + u_{yy})$ for some choice of wave speed. Describe the behavior of such solutions.


11.6.35. A traveling-wave solution to the two-dimensional wave equation has the form $u(t,x,y) = v(x - a t, y - a t)$, where $a$ is a constant. Find the partial differential equation satisfied by the function $v(\xi,\eta)$. Is the equation hyperbolic?


11.6.36.  Is  the  counterpart  of  Lemma  11.11  valid  for  the  wave  equation?  In  other  words,  if 
v(t,x)  and  w(t,x)  are  any  two  solutions  to  the  one-dimensional  wave  equation,  is  their 
product  u(t,  x ,  y)  =  v(t,  x)  w(t ,  y)  a  solution  to  the  two-dimensional  wave  equation? 


11.6.37. (a) How would you solve an initial-boundary value problem for the wave equation on a rectangle that is not aligned with the coordinate axes? (b) Apply your method to set up and solve an initial-boundary value problem on the square $R = \{\, |x + y| < 1, \ |x - y| < 1 \,\}$.


Chladni  Figures  and  Nodal  Curves 


When a membrane vibrates, its individual atoms typically move up and down in a quasiperiodic manner. As such, there is little correlation between their motions at different locations. However, if the membrane is set to vibrate in a pure eigenmode, say


$$u_n(t,x,y) = \cos(\omega_n t)\, v_n(x,y), \tag{11.168}$$

then all points move up and down at a common frequency $\omega_n = \sqrt{\lambda_n}$, which is the square root of the eigenvalue corresponding to the eigenfunction $v_n(x,y)$. The exceptions are the points where the eigenfunction vanishes:

$$v_n(x,y) = 0, \tag{11.169}$$

which remain stationary. The set of all points $(x,y) \in \Omega$ that satisfy (11.169) is known as the $n$th Chladni figure of the domain $\Omega$, named in honor of the eighteenth-century German physicist and musician Ernst Chladni, who first observed them experimentally by exciting a metal plate with his violin bow, [43]. The mathematical models governing such vibrating
plates  were  formulated  by  the  French  mathematician  Sophie  Germain  in  the  early  1800s. 
It can be shown that, in general, each Chladni figure consists of a finite system of nodal curves, [34, 43], that partition the membrane into disjoint nodal regions. As the membrane vibrates, the nodal curves remain stationary, while each nodal region is entirely either above or below the equilibrium plane, except, momentarily, when the entire membrane has zero displacement. As Chladni discovered in his original experiments, scattering small particles (e.g., fine sand) over a membrane or plate vibrating in an eigenmode will enable us to visualize the Chladni figure, because the particles will tend to accumulate along the stationary nodal curves. Adjacent nodal regions, lying on the opposite sides of a nodal curve, move in opposing directions: when one is up, its neighbors are down, and then they switch roles as the membrane becomes momentarily flat. Let us look at a couple of examples where the Chladni figures can be readily determined.

Figure 11.12. Nodal curves and relative vibrational frequencies of a circular membrane.
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Example  11.15.  Circular  Drums.  Since  the  eigenfunctions  (11.156)  for  a  disk  are 
products  of  trigonometric  functions  in  the  angular  variable  and  Bessel  functions  of  the 
radius,  the  nodal  curves  for  the  normal  modes  of  vibrations  of  a  circular  membrane  are 
rays  emanating  from  and  circles  centered  at  the  origin.  Consequently,  the  nodal  regions 
are  annular  sectors.  Chladni  figures  associated  with  the  first  nine  normal  modes,  indexed 
by  their  relative  frequencies,  are  plotted  in  Figure  11.12.  Representative  displacements  of 
the  membrane  in  each  of  the  first  twelve  modes  can  be  found  earlier,  in  Figure  11.6.  The 
dominant  (lowest  frequency)  mode  is  the  only  one  that  has  no  nodal  curves;  it  has  the 
form  of  a  radially  symmetric  bump  where  the  entire  membrane  flexes  up  and  down.  The 
next  lowest  modes  vibrate  proportionally  faster  at  a  relative  frequency  Pn  ~  1.593.  The 
most  general  solution  with  this  vibrational  frequency  is  a  linear  combination  of  the  two 
eigensolutions:  au1  1  +  (3 u1  v  Each  such  combination  has  a  single  diameter  as  a  nodal 
curve,  whose  angle  with  the  horizontal  depends  on  the  ratio  /3/a.  The  two  semicircular 
halves  of  the  drum  vibrate  in  opposing  directions  —  when  the  top  half  is  up,  the  bottom 
half  is  down  and  vice  versa.  The  next  set  of  modes  have  two  perpendicular  diameters  as 
nodal  curves;  the  four  quadrants  of  the  drum  vibrate  in  tandem,  with  opposite  quadrants 
moving  in  the  same  direction.  Next  in  increasing  order  of  vibrational  frequency  is  a  single 
mode,  which  has  a  circular  nodal  curve  whose  (relative)  radius  equals  the  ratio  of  the 
first  two  roots  of  the  order  zero  Bessel  function,  Co  2 /Co  l  ~  .43565;  see  Exercise  11.6.39 
for  a  justification.  In  this  case,  the  inner  disk  and  the  outer  annulus  vibrate  in  opposing 
directions.  And  so  on  . . .  . 
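
The two numerical values quoted in this example can be reproduced directly from the Bessel roots. The following is a minimal sketch, not the book's method: it assumes the standard power series for $J_m$ and locates roots by a sign scan followed by bisection. The helper names (`bessel_j`, `bisect_root`, `positive_roots`) are illustrative choices.

```python
import math

def bessel_j(m, x, terms=40):
    """Power series for the Bessel function J_m(x) of integer order m >= 0."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.factorial(k + m))
                  * (x / 2) ** (2 * k + m))
    return total

def bisect_root(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f changes sign on [lo, hi]."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid <= 0:
            hi = mid
        else:
            lo, flo = mid, fmid
    return 0.5 * (lo + hi)

def positive_roots(m, count, step=0.1):
    """First `count` positive roots of J_m, found by scanning for sign changes."""
    roots, x = [], step
    while len(roots) < count:
        if bessel_j(m, x) * bessel_j(m, x + step) < 0:
            roots.append(bisect_root(lambda s: bessel_j(m, s), x, x + step))
        x += step
    return roots

zeta_01, zeta_02 = positive_roots(0, 2)    # first two roots of J_0
zeta_11 = positive_roots(1, 1)[0]          # first positive root of J_1
relative_frequency = zeta_11 / zeta_01     # second-lowest relative frequency
nodal_radius = zeta_01 / zeta_02           # radius of the first nodal circle
print(round(relative_frequency, 3), round(nodal_radius, 5))
```

The series is adequate only for the small arguments needed here; for large arguments a library routine would be the safer choice.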


Example 11.16. Rectangular Drums. For most rectangular drums, the Chladni figures
are relatively uninteresting. Since the normal modes (11.151) are separable products
of trigonometric functions in the coordinate variables x, y, the nodal curves are equally
spaced straight lines parallel to the sides of the rectangle. The internodal regions are
smaller rectangles, of identical size and shape, with adjacent rectangles vibrating in opposite
directions.

More interesting figures appear when the rectangle admits multiple eigenvalues, so-called
accidental degeneracies. Note that two of the eigenvalues (11.149) coincide,
$\lambda_{m,n} = \lambda_{k,l}$, if and only if

$$ \frac{m^2}{a^2} + \frac{n^2}{b^2} = \frac{k^2}{a^2} + \frac{l^2}{b^2}, \tag{11.170} $$

where $(m,n) \ne (k,l)$ are distinct pairs of positive integers. In such situations, the two
eigenmodes happen to vibrate with a common frequency $\omega = \omega_{m,n} = \omega_{k,l}$. Consequently,
any linear combination of the eigenmodes, e.g.,

$$ \cos(\omega t) \left( \alpha \sin\frac{m\pi x}{a}\, \sin\frac{n\pi y}{b} + \beta \sin\frac{k\pi x}{a}\, \sin\frac{l\pi y}{b} \right), \qquad \alpha, \beta \in \mathbb{R}, $$


is also a pure vibration, and hence qualifies as a normal mode. The associated nodal curves,

$$ \alpha \sin\frac{m\pi x}{a}\, \sin\frac{n\pi y}{b} + \beta \sin\frac{k\pi x}{a}\, \sin\frac{l\pi y}{b} = 0, \qquad 0 \le x \le a, \quad 0 \le y \le b, \tag{11.171} $$

have a more intriguing geometry, which can change dramatically as the coefficients $\alpha, \beta$
vary.

For example, on the unit square $R = \{0 < x, y < 1\}$, an accidental degeneracy occurs
whenever

$$ m^2 + n^2 = k^2 + l^2 \tag{11.172} $$

for distinct pairs of positive integers $(m,n) \ne (k,l)$. The simplest possibility arises whenever
$m \ne n$, in which case we can merely reverse the order, setting $k = n$, $l = m$. In
Figure 11.13 we plot three sample nodal curves

$$ \alpha \sin 4\pi x \, \sin \pi y + \beta \sin \pi x \, \sin 4\pi y = 0, $$

corresponding to three different linear combinations of the eigenfunctions with $m = l = 4$,
$n = k = 1$. The associated vibrational frequency is, in all cases, $\omega_{4,1} = c\sqrt{17}\,\pi$, where $c$ is
the wave speed.

Figure 11.13. Some Chladni figures for a square membrane.

Classifying accidental degeneracies of rectangles takes us into the realm of number
theory, [9, 29]. In the case of a square, equation (11.172) is asking us to locate all integer
points $(m,n) \in \mathbb{Z}^2$ that lie on a common circle.

Remark: Bourget's hypothesis, mentioned after (11.157), implies that $\zeta_{m,n} \ne \zeta_{k,l}$
whenever $(m,n) \ne (k,l)$. This implies that a disk has no accidental degeneracies, and
hence all its nodal curves are concentric circles and diameters.
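
For the square, the search for integer points on a common circle in (11.172) is easy to automate. The sketch below is only an illustration: the function name `square_degeneracies` and the search bound are assumptions, and brute-force enumeration stands in for the number-theoretic classification cited above.

```python
from collections import defaultdict

def square_degeneracies(limit):
    """Group pairs 1 <= m <= n <= limit by the value m^2 + n^2.

    Listing each pair with m <= n discards the trivial reversal (m, n) -> (n, m),
    so any value with two or more pairs is a degeneracy beyond mere reversal.
    Every reported value is genuine, but the list of pairs is guaranteed
    complete only for values up to limit^2 + 1.
    """
    pairs_by_sum = defaultdict(list)
    for m in range(1, limit + 1):
        for n in range(m, limit + 1):
            pairs_by_sum[m * m + n * n].append((m, n))
    return {s: p for s, p in sorted(pairs_by_sum.items()) if len(p) > 1}

for total, pairs in square_degeneracies(12).items():
    print(total, pairs)
```

Each printed line gives a common value of $m^2 + n^2$ together with the distinct index pairs that share it, and hence a family of eigenmodes with a common vibrational frequency.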


Exercises 

<0>  11.6.38.  Suppose  that  a  membrane  is  vibrating  in  a  normal  mode.  Prove  that  the  membrane 
lies  instantaneously  completely  flat  at  regular  time  intervals. 

0  11.6.39.  For  a  vibrating  disk  of  unit  radius,  determine  the  radius  of  the  circular  nodal  curve  for 
the next-to-lowest circular mode.

11.6.40.  Order  the  five  nodal  circles  displayed  in  Figure  11.12  according  to  their  size. 

11.6.41.  Sketch  the  Chladni  figures  in  a  unit  disk  corresponding  to  the  following  vibrational 
frequencies.  Determine  numerical  values  for  the  radii  of  any  circular  nodal  curves. 

(a) $\omega_{4,0}$, (b) $\omega_{4,2}$, (c) $\omega_{2,4}$, (d) $\omega_{3,3}$, (e) $\omega_{1,5}$.

11.6.42.  True  or  false:  Any  diameter  of  a  circular  disk  is  a  nodal  curve  for  some  normal  mode. 

11.6.43.  True  or  false:  The  nodal  curves  on  a  semicircular  disk  are  all  semicircles  and  rays 
emanating  from  the  center. 
11.6.44. (a) Find the smallest distinct pair of positive integers $(k,l) \ne (m,n)$ satisfying (11.172)
that are not obtained by simply reversing the order, i.e., $(k,l) \ne (n,m)$. (b) Find the
next-smallest example. (c) Plot two or three Chladni figures arising from such degenerate
eigenfunctions.

T  11.6.45.  Let  R  be  a  rectangle  all  of  whose  sides  are  fixed  to  the  (x,  y) -plane.  Suppose  that  all 
its  nodal  curves  are  straight  lines.  What  can  you  say  about  its  side  lengths  a,  b ? 

11.6.46.  True  or  false:  The  nodal  regions  of  a  vibrating  rectangle  are  similarly  shaped 
rectangles. 

0  11.6.47.  Prove  that  any  point  of  intersection  (x0,y0)  of  two  nodal  curves  associated  with  the 
same normal mode is a critical point of the associated eigenfunction: $\nabla u(x_0,y_0) = 0$.

11.6.48.  True  or  false:  The  nodal  curves  on  a  domain  do  not  depend  on  the  choice  of  boundary 
conditions. 


Chapter  12 

Partial  Differential  Equations  in  Space 


At  last  we  have  ascended  to  the  ultimate  rung  of  the  dimensional  ladder  (at  least  for  those 
of  us  living  in  a  three-dimensional  universe):  partial  differential  equations  in  physical  space. 
As  in  the  one-  and  two-dimensional  settings  developed  in  the  preceding  chapters,  the  main 
protagonists  are  the  Laplace  and  Poisson  equations,  modeling  equilibrium  configurations  of 
solid  bodies;  the  three-dimensional  wave  equation,  governing  vibrations  of  solids,  liquids, 
and  electromagnetic  waves;  and  the  three-dimensional  heat  equation,  modeling  spatial 
diffusion  processes.  To  conclude  this  chapter  —  and  the  book  —  we  will  also  analyze  the 
particular  three-dimensional  Schrodinger  equation  that  governs  the  hydrogen  atom,  and 
thereby  characterizes  atomic  orbitals. 

Fortunately, almost everything of importance has already appeared in the previous
chapters, and appending a third dimension is, for the most part, simply a matter of appropriately
adapting the constructions. We have already developed the principal solution
techniques: separation of variables, Green's functions, and fundamental solutions. In
three-dimensional problems, separation of variables is applicable in a variety of coordinate
systems, including the usual rectangular, cylindrical, and spherical coordinates. The first
two do not lead to anything fundamentally new, and are therefore relegated to the exercises.
Separation in spherical coordinates requires spherical Bessel functions and spherical
harmonics, which play essential roles in a wide variety of physical systems, both classical
and quantum.

The Green's function for the three-dimensional Poisson equation in space can be identified
as the classic Newton (Coulomb) $1/r$ gravitational (electrostatic) potential. The
fundamental solution for the three-dimensional heat equation can be easily guessed from
its one- and two-dimensional forms. The three-dimensional wave equation, surprisingly,
has an explicit solution formula, named after Kirchhoff, of electrical fame, but originally
due to Poisson. Counterintuitively, the best way to handle the two-dimensional wave equation
is by "descending" from the simpler(!) three-dimensional Kirchhoff formula. Descent
reveals a remarkable difference between waves in planar and spatial media. Huygens' Principle
states that three-dimensional waves emanating from a localized initial disturbance
remain localized as they propagate through space. In contrast, initially concentrated
two-dimensional disturbances leave a slowly decaying remnant that never entirely disappears.

The  final  section  is  concerned  with  the  Schrodinger  equation  for  a  hydrogen  atom, 
that  is,  the  quantum-dynamical  system  governing  the  spatial  motion  of  a  single  electron 
around  a  positively  charged  nucleus.  As  we  will  see,  the  spherical  harmonic  eigensolutions 
account for the observed quantum energy levels of atoms that underlie the periodic table
and  hence  the  foundations  of  molecular  chemistry. 
12.1 The Three-Dimensional Laplace and Poisson Equations


We  begin  our  investigations,  as  usual,  with  systems  in  equilibrium,  deferring  dynamics 
until  later.  The  prototypical  equilibrium  system  is  the  three-dimensional  Laplace  equation 


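$$ \Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0, \tag{12.1} $$

in which $\mathbf{x} = (x,y,z)^T$ represents rectangular coordinates on $\mathbb{R}^3$. The solutions $u(x,y,z)$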
continue  to  be  known  as  harmonic  functions.  The  Laplace  equation  models  unforced 
equilibria;  Poisson’s  equation  is  the  inhomogeneous  version 


$$ -\Delta u = f(x,y,z), \tag{12.2} $$

whose  right-hand  side  represents  some  form  of  external  forcing. 

The basic boundary value problem for the Laplace and Poisson equations seeks a
solution inside a bounded domain $\Omega \subset \mathbb{R}^3$ subject to either Dirichlet boundary conditions,
prescribing the function values on the domain's boundary:

$$ u = h \quad \text{on} \quad \partial\Omega, \tag{12.3} $$

or Neumann boundary conditions, prescribing its normal derivative or flux through the
boundary:

$$ \frac{\partial u}{\partial \mathbf{n}} = k \quad \text{on} \quad \partial\Omega, \tag{12.4} $$

or mixed boundary conditions, in which one imposes Dirichlet conditions on part of the
boundary and Neumann conditions on the remainder. Keep in mind that the boundary of
the solid domain $\Omega$ consists of one or more piecewise smooth closed surfaces, which will be
oriented by use of the outward (meaning exterior to the domain) unit normal $\mathbf{n}$.

The  boundary  value  problems  for  the  three-dimensional  Laplace  and  Poisson  equations 
govern  a  wide  variety  of  physical  systems,  including: 


Heat conduction: The solution $u$ represents the equilibrium temperature in a solid
body. The inhomogeneity $f$ represents some form of internal heat source or sink.
Dirichlet conditions correspond to fixing the temperature on the bounding surface(s),
whereas homogeneous Neumann conditions correspond to an insulated
boundary, i.e., one that does not allow any heat flux.

Ideal fluid flow: Here the solution $u$ to the Laplace equation represents the velocity potential
for an incompressible, irrotational steady-state fluid flow inside a container
governed by the velocity vector field $\mathbf{v} = \nabla u$. Homogeneous Neumann boundary
conditions correspond to a solid boundary that the fluid cannot penetrate.

Elasticity: In certain restricted contexts, $u$ represents an equilibrium deformation of
a solid body, e.g., the radial deformation of an elastic ball.

Electrostatics: In applications to electromagnetism, $u$ is the electric potential in a
conducting medium; its gradient $\nabla u$ prescribes the electromotive force on a charged
particle. The inhomogeneity $f$ represents an external electrostatic force field.

Gravitation: The Newtonian gravitational potential in flat empty space is also prescribed
by the Laplace equation. (In contrast, Einstein's theory of general relativity
requires a vastly more complicated nonlinear system of partial differential
equations, [75].)
Self-Adjoint Formulation and Minimum Principle


The Laplace and Poisson equations naturally fit into the general self-adjoint equilibrium
framework summarized in Chapter 9. We introduce the $L^2$ inner products

$$ \langle u, \tilde u \rangle = \iiint_\Omega u(x,y,z)\, \tilde u(x,y,z)\, dx\, dy\, dz, \qquad \langle\langle \mathbf{v}, \tilde{\mathbf{v}} \rangle\rangle = \iiint_\Omega \mathbf{v}(x,y,z) \cdot \tilde{\mathbf{v}}(x,y,z)\, dx\, dy\, dz, \tag{12.5} $$

between, respectively, scalar fields $u, \tilde u$ and vector fields $\mathbf{v}, \tilde{\mathbf{v}}$, which are defined on the
domain $\Omega \subset \mathbb{R}^3$. We assume that the functions in question are sufficiently nice in order
that these inner products be well defined; if $\Omega$ is unbounded, this, in essence, requires that
they decay reasonably rapidly to zero at large distances.

When subject to suitable homogeneous boundary conditions, the three-dimensional
Laplace equation can be placed in our standard self-adjoint form

$$ -\Delta u = -\nabla \cdot \nabla u = \nabla^* \circ \nabla u. \tag{12.6} $$

This relies on the fact that the adjoint of the gradient operator with respect to the $L^2$ inner
products (12.5) is minus the divergence operator:

$$ \nabla^* \mathbf{v} = -\nabla \cdot \mathbf{v}. \tag{12.7} $$

As usual, the determination of the adjoint rests on an integration by parts formula, which,
in three-dimensional space, is a consequence of the Divergence Theorem from multivariable
calculus, [8, 108]:

Theorem 12.1. Let $\Omega \subset \mathbb{R}^3$ be a bounded domain whose boundary $\partial\Omega$ consists
of one or more piecewise smooth simple closed surfaces. Let $\mathbf{n}$ denote the unit outward
normal to the boundary of $\Omega$. Let $\mathbf{v}$ be a $C^1$ vector field defined on $\Omega$ and continuous
up to its boundary. Then the surface integral, with respect to surface area, of the normal
component of $\mathbf{v}$ over the boundary of the domain equals the triple integral of its divergence
over the domain:

$$ \iint_{\partial\Omega} \mathbf{v} \cdot \mathbf{n}\, dS = \iiint_\Omega \nabla \cdot \mathbf{v}\, dx\, dy\, dz. \tag{12.8} $$

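Replacing $\mathbf{v}$ by the product $u\,\mathbf{v}$ of a scalar field $u$ and a vector field $\mathbf{v}$ yields

$$ \iiint_\Omega \bigl( u\, \nabla \cdot \mathbf{v} + \nabla u \cdot \mathbf{v} \bigr)\, dx\, dy\, dz = \iiint_\Omega \nabla \cdot (u\, \mathbf{v})\, dx\, dy\, dz = \iint_{\partial\Omega} u\, (\mathbf{v} \cdot \mathbf{n})\, dS. \tag{12.9} $$

Rearranging the terms produces the desired integration by parts formula for triple integrals:

$$ \iiint_\Omega (\nabla u \cdot \mathbf{v})\, dx\, dy\, dz = \iint_{\partial\Omega} u\, (\mathbf{v} \cdot \mathbf{n})\, dS - \iiint_\Omega u\, (\nabla \cdot \mathbf{v})\, dx\, dy\, dz. \tag{12.10} $$

The boundary surface integral will vanish, provided either $u = 0$ or $\mathbf{v} \cdot \mathbf{n} = 0$ at each point
on $\partial\Omega$. When $u = 0$ on all of $\partial\Omega$, we have homogeneous Dirichlet conditions. Setting
$\mathbf{v} \cdot \mathbf{n} = 0$ everywhere on $\partial\Omega$ results in the homogeneous Neumann boundary value problem,
owing to the identification of $\mathbf{v} = \nabla u$. Finally, the mixed boundary value problem takes
$u = 0$ on part of $\partial\Omega$ and $\mathbf{v} \cdot \mathbf{n} = 0$ on the rest. Thus, subject to one of these choices, the
integration by parts formula (12.10) reduces to

$$ \langle\langle \nabla u, \mathbf{v} \rangle\rangle = \langle u, -\nabla \cdot \mathbf{v} \rangle, \tag{12.11} $$

which suffices to establish the adjoint formula (12.7).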
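
The adjoint identity (12.11) can be checked numerically on a simple domain. The sketch below makes several assumptions for illustration only: the domain is the unit cube, $u$ is a product of sines (so it vanishes on the boundary), the vector field $\mathbf{v}$ is an arbitrary choice, and the integrals are approximated by the midpoint rule.

```python
import math

def midpoint_triple_integral(f, n=24):
    """Midpoint-rule approximation of the integral of f over the unit cube."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return sum(f(x, y, z) for x in pts for y in pts for z in pts) * h ** 3

PI = math.pi

def u(x, y, z):
    """Scalar field vanishing on every face of the unit cube."""
    return math.sin(PI * x) * math.sin(PI * y) * math.sin(PI * z)

def grad_u(x, y, z):
    sx, sy, sz = math.sin(PI * x), math.sin(PI * y), math.sin(PI * z)
    cx, cy, cz = math.cos(PI * x), math.cos(PI * y), math.cos(PI * z)
    return (PI * cx * sy * sz, PI * sx * cy * sz, PI * sx * sy * cz)

def v(x, y, z):
    """An arbitrary smooth vector field (illustrative choice)."""
    return (x, y * y, x * z)

def div_v(x, y, z):
    """Divergence of v, computed by hand: 1 + 2y + x."""
    return 1.0 + 2.0 * y + x

# Left side of (12.11): inner product of grad u with v.
lhs = midpoint_triple_integral(
    lambda x, y, z: sum(a * b for a, b in zip(grad_u(x, y, z), v(x, y, z))))
# Right side of (12.11): inner product of u with minus div v.
rhs = midpoint_triple_integral(lambda x, y, z: -u(x, y, z) * div_v(x, y, z))
print(lhs, rhs)
```

The two printed values agree up to quadrature error because the boundary term in (12.10) vanishes when $u = 0$ on the boundary; for this particular choice both integrals equal $-20/\pi^3$ exactly.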
Remark: Adopting more general weighted inner products results in a more general
elliptic  boundary  value  problem.  See  Exercise  12.1.9  for  details. 

According to Theorem 9.20, the self-adjoint formulation (12.6) automatically implies
positive semi-definiteness of the boundary value problem, with positive definiteness if
$\ker \nabla = \{0\}$. Since, on a connected domain, only constant functions are annihilated by
the gradient operator (see Lemma 6.16, which also applies to three-dimensional domains),
both the Dirichlet and mixed boundary value problems are positive definite, while the
Neumann boundary value problem is only positive semi-definite.

Finally,  in  the  positive  definite  cases,  Theorem  9.26  implies  that  the  solution  can 
be  characterized  by  the  three-dimensional  version  of  the  Dirichlet  minimization  principle 
(9.82). 

Theorem 12.2. The solution $u(x,y,z)$ to the Poisson equation (12.2) subject to
homogeneous Dirichlet or mixed boundary conditions (12.3) is the unique function that
minimizes the Dirichlet integral

$$ \tfrac12\, ||| \nabla u |||^2 - \langle u, f \rangle = \iiint_\Omega \left[ \tfrac12 \left( u_x^2 + u_y^2 + u_z^2 \right) - f\, u \right] dx\, dy\, dz \tag{12.12} $$

among all $C^2$ functions that satisfy the prescribed boundary conditions.


As  in  the  two-dimensional  version  discussed  in  Chapter  9,  the  Dirichlet  minimization 
principle  continues  to  hold  in  the  case  of  the  inhomogeneous  Dirichlet  boundary  value 
problem.  Modifications  for  the  inhomogeneous  mixed  boundary  value  problem  appear  in 
Exercise  12.1.13. 


Exercises 

12.1.1. Find bases for the following: (a) the space of harmonic polynomials $u(x,y,z)$ of degree
$\le 2$; (b) the space of homogeneous cubic harmonic polynomials $u(x,y,z)$.

12.1.2.  True  or  false:  (a)  Every  harmonic  polynomial  is  homogeneous. 

(b)  Every  homogeneous  polynomial  is  harmonic. 

12.1.3. Solve the Poisson boundary value problem $-\Delta u = 1$ on the unit ball $x^2 + y^2 + z^2 < 1$
with homogeneous Dirichlet boundary conditions. Hint: Look for a polynomial solution.

0  12.1.4.  Prove  that  if  u(x,y,z)  solves  the  Laplace  equation,  then  so  does  the  translated  function 
$U(x,y,z) = u(x - a, y - b, z - c)$ for constants $a, b, c$.

0  12.1.5.  (a)  Prove  that  if  u(x,y,z)  solves  Laplace’s  equation,  so  does  the  rescaled  function 
$U(x,y,z) = u(\lambda x, \lambda y, \lambda z)$ for any constant $\lambda$. (b) More generally, show that
$U(x,y,z) = \mu\, u(\lambda x, \lambda y, \lambda z) + c$ solves Laplace's equation for any constants $\lambda, \mu, c$.

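^ 12.1.6. Let $A$ be a constant nonsingular $3 \times 3$ matrix, $u(\mathbf{x})$ a $C^1$ scalar field, and $\mathbf{v}(\mathbf{x})$ a $C^1$
vector field. Set $U(\mathbf{x}) = u(A\mathbf{x})$ and $\mathbf{V}(\mathbf{x}) = \mathbf{v}(A\mathbf{x})$. Prove that

(a) $\nabla U(\mathbf{x}) = A^T \nabla u(A\mathbf{x})$, (b) $\nabla \cdot \mathbf{V}(\mathbf{x}) = w(A\mathbf{x})$, where $w(\mathbf{x}) = \nabla \cdot (A\mathbf{v})(\mathbf{x})$.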

0  12.1.7.  Prove  that  every  rotation  and  reflection  is  a  symmetry  of  the  Laplace  equation.  In 
other words, if $Q$ is any $3 \times 3$ orthogonal matrix, so $Q^T Q = I$, and $u(\mathbf{x})$ is a harmonic
function,  then  so  is  U(x)  =  u(Qx).  Hint :  Use  Exercise  12.1.6. 
12.1.8. The Weak Maximum Principle: Let $\Omega \subset \mathbb{R}^3$ be a bounded domain. Let $u(x,y,z)$ solve
the Poisson equation $-\Delta u = f(x,y,z)$, where $f(x,y,z) < 0$ for all $(x,y,z) \in \Omega$.

(a) Prove that the maximum value of $u$ occurs on the boundary $\partial\Omega$.

Hint: Explain why $u$ cannot have a local maximum at any interior point in $\Omega$.

(b) Generalize your result to the case $f(x,y,z) \le 0$.

Hint: Look at $v_\varepsilon(x,y,z) = u(x,y,z) + \varepsilon\, (x^2 + y^2 + z^2)$ and let $\varepsilon \to 0^+$.


§ 12.1.9. Find the equilibrium equations corresponding to minimizing $||| \nabla u |||^2$ subject to homogeneous
Dirichlet boundary conditions, where the indicated norm is based on the weighted
inner product

$$ \langle\langle \mathbf{v}, \mathbf{w} \rangle\rangle = \iiint_\Omega \mathbf{v}(x,y,z) \cdot \mathbf{w}(x,y,z)\, \sigma(x,y,z)\, dx\, dy\, dz, $$

with $\sigma(x,y,z) > 0$ a positive scalar function.


0  12.1.10.  Prove  the  following  vector  calculus  identities: 

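(a) $\nabla \cdot (u\, \mathbf{v}) = \nabla u \cdot \mathbf{v} + u\, \nabla \cdot \mathbf{v}$, (b) $\nabla \times (u\, \mathbf{v}) = \nabla u \times \mathbf{v} + u\, \nabla \times \mathbf{v}$,

(c) $\nabla \cdot (\mathbf{v} \times \mathbf{w}) = (\nabla \times \mathbf{v}) \cdot \mathbf{w} - \mathbf{v} \cdot (\nabla \times \mathbf{w})$, (d) $\nabla \times (\nabla \times \mathbf{v}) = \nabla(\nabla \cdot \mathbf{v}) - \Delta \mathbf{v}$.

(In the final term, the Laplacian $\Delta$ acts component-wise on the vector field $\mathbf{v}$.)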


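^ 12.1.11. Let $\Omega$ be a bounded domain with piecewise smooth boundary $\partial\Omega$. Prove the following
identities: (a) $\iiint_\Omega \Delta u\, dx\, dy\, dz = \iint_{\partial\Omega} \frac{\partial u}{\partial \mathbf{n}}\, dS$,

(b) $\iiint_\Omega u\, \Delta u\, dx\, dy\, dz = \iint_{\partial\Omega} u\, \frac{\partial u}{\partial \mathbf{n}}\, dS - \iiint_\Omega \| \nabla u \|^2\, dx\, dy\, dz$.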

12.1.12. Suppose the inhomogeneous Neumann boundary value problem (12.1, 4) has a solution.
(a) Prove that $\iint_{\partial\Omega} k\, dS = 0$. (b) Is the solution unique? If not, what is the most general
solution? (c) State and prove an analogous result for the inhomogeneous Poisson equation
$-\Delta u = f(x,y,z)$. (d) Provide a physical explanation for your answers.


^ 12.1.13. Find a minimization principle that characterizes the solution to the inhomogeneous
mixed boundary value problem $-\Delta u = f$ on $\Omega$, with $u = g$ on $D \subset \partial\Omega$, and $\partial u/\partial \mathbf{n} = h$
on $N = \partial\Omega \setminus D$.

P? 12.1.14. (a) Prove that, subject to suitable boundary conditions, the curl $\nabla \times$ defines a self-adjoint
operator with respect to the $L^2$ inner product between vector fields. What kinds
of boundary conditions do you need to impose for your integration by parts argument to be
valid? Hint: Use the identity in Exercise 12.1.10(c). (b) What operator on vector fields is
given by the self-adjoint composition $S = (\nabla \times)^* \circ (\nabla \times)$? (c) Choose a set of homogeneous
boundary conditions that make $S$ self-adjoint. Is the resulting boundary value
problem $S[\mathbf{v}] = \mathbf{f}$ positive definite? If not, what does the Fredholm Alternative say about
its solvability?


12.2  Separation  of  Variables  for  the  Laplace  Equation 

In  this  section,  we  revisit  the  method  of  separation  of  variables  in  the  context  of  the  three- 
dimensional  Laplace  equation.  As  always,  its  applicability  is  unfortunately  restricted  to 
rather  special,  but  important,  geometric  configurations,  the  simplest  being  rectangular, 
cylindrical,  and  spherical  domains.  Since  the  first  two  are  straightforward  extensions  of 
their  two-dimensional  counterparts,  we  will  discuss  only  spherically  separable  solutions  in 
any  detail. 

The simplest domain to which the separation of variables method applies is a rectangular
box:

$$ B = \{ 0 < x < a, \ 0 < y < b, \ 0 < z < c \}. $$

For functions of three variables, one begins the separation process by splitting off one of
them, by setting $u(x,y,z) = v(x)\, w(y,z)$, say. The function $v(x)$ satisfies a simple second-order
ordinary differential equation, while $w(y,z)$ solves the two-dimensional Helmholtz
equation (11.21), which is further separated by writing $w(y,z) = p(y)\, q(z)$. The resulting
fully separated solutions $u(x,y,z) = v(x)\, p(y)\, q(z)$ are (mostly) products of trigonometric
and hyperbolic functions. Implementation of the technique and analysis of the resulting
series solutions are relegated to Exercise 12.2.34.
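
As a quick plausibility check on the claim that products of trigonometric and hyperbolic functions are harmonic, the following sketch tests one such product with finite differences. The side lengths, mode numbers, sample point, and the choice of which variable carries the hyperbolic factor are all illustrative assumptions, not taken from the exercise.

```python
import math

a, b = 1.0, 2.0          # hypothetical side lengths in x and y
m, n = 2, 3              # hypothetical mode numbers
omega = math.pi * math.sqrt((m / a) ** 2 + (n / b) ** 2)

def u(x, y, z):
    """One fully separated product: trigonometric in x and y, hyperbolic in z."""
    return (math.sin(m * math.pi * x / a)
            * math.sin(n * math.pi * y / b)
            * math.sinh(omega * z))

def laplacian(f, x, y, z, h=1e-3):
    """Second-order central-difference approximation of the Laplacian of f."""
    center = f(x, y, z)
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h) - 6.0 * center) / h ** 2

point = (0.3, 0.7, 0.4)
residual = laplacian(u, *point)
print(residual, u(*point))
```

The residual is small compared with the size of $u$ and of its individual second derivatives; it is not exactly zero because of the finite-difference truncation error. The value of `omega` is forced by the requirement that the two trigonometric factors and the hyperbolic factor balance in the Laplace equation.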

In the case that the domain is a cylinder, one passes to cylindrical coordinates $r, \theta, z$,
where

$$ x = r \cos\theta, \qquad y = r \sin\theta, \qquad z = z, \tag{12.13} $$

to effect the separation. Writing $u(r,\theta,z) = v(r,\theta)\, w(z)$, one finds that $w(z)$ satisfies a
simple second-order ordinary differential equation, while $v(r,\theta)$ solves the two-dimensional
polar Helmholtz equation (11.51) on a disk. Applying a further separation to $v(r,\theta)$, as
in Chapter 11, produces fully separable solutions $u(r,\theta,z) = p(r)\, q(\theta)\, w(z)$ as products of
Bessel functions of the cylindrical radius $r$, trigonometric functions of the polar angle $\theta$,
and hyperbolic functions of $z$; see Exercise 12.2.40.

The  most  interesting  case  is  that  of  spherical  coordinates,  which  we  proceed  to  analyze 
in  detail  in  the  following  subsection. 

Remark: These are just three of the many coordinate systems in which the three-dimensional
Laplace equation separates. See [78, 79] for 37 additional exotic types, including
ellipsoidal, toroidal, and parabolic spheroidal coordinates. The resulting separable
solutions are written in terms of new classes of special functions that solve interesting
second-order ordinary differential equations, all of Sturm-Liouville form (9.71).


Laplace’s  Equation  in  a  Ball 

Suppose a solid ball (e.g., the Earth) is subject to a specified steady temperature distribution
on its spherical boundary. Our task is to determine the equilibrium temperature
within the ball. We assume that the body is composed of an isotropic, uniform medium
and, to slightly simplify the analysis, choose units in which its radius equals 1.

To find the equilibrium temperature within the ball, we must solve the Dirichlet boundary
value problem

$$ \Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0, \quad x^2 + y^2 + z^2 < 1, \qquad u(x,y,z) = h(x,y,z), \quad x^2 + y^2 + z^2 = 1, \tag{12.14} $$

where $h$ is prescribed on the bounding unit sphere. Problems in spherical geometries are
most naturally analyzed in spherical coordinates $r, \varphi, \theta$. Our convention is to set


$$ x = r \sin\varphi \cos\theta, \qquad y = r \sin\varphi \sin\theta, \qquad z = r \cos\varphi, \tag{12.15} $$

where $-\pi < \theta \le \pi$ is the azimuthal angle or longitude, while $0 \le \varphi \le \pi$ is the zenith
angle or latitude on the sphere of radius $r = \sqrt{x^2 + y^2 + z^2}$. In other words, $\varphi$ measures
the angle between the vector $(x,y,z)^T$ and the positive $z$-axis, while $\theta$ measures the
angle between its projection $(x,y,0)^T$ on the $(x,y)$-plane and the positive $x$-axis; see
Figure 12.1. On Earth, longitude $\theta$ is measured from the Greenwich prime meridian, while
latitude is measured from the equator, and so equals $\frac12 \pi - \varphi$ (although the everyday units
are degrees, not radians), whereby $\varphi$ is sometimes referred to as the "co-latitude".

Figure 12.1. Spherical coordinates.
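
The convention (12.15) can be made concrete with a pair of conversion routines. This is a minimal sketch; the function names are illustrative, and the handling of the origin (where the angles are undefined) is an arbitrary choice.

```python
import math

def to_cartesian(r, phi, theta):
    """Convention (12.15): phi is the zenith angle, theta the azimuthal angle."""
    return (r * math.sin(phi) * math.cos(theta),
            r * math.sin(phi) * math.sin(theta),
            r * math.cos(phi))

def to_spherical(x, y, z):
    """Inverse map, with phi in [0, pi] and theta taken from atan2."""
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        return 0.0, 0.0, 0.0          # angles are arbitrary at the origin
    phi = math.acos(max(-1.0, min(1.0, z / r)))   # clamp guards against rounding
    theta = math.atan2(y, x)          # angle of the projection (x, y, 0)
    return r, phi, theta

def latitude_degrees(phi):
    """Geographic latitude, pi/2 - phi, converted to degrees."""
    return math.degrees(math.pi / 2 - phi)

r, phi, theta = to_spherical(1.0, 1.0, math.sqrt(2.0))
print(r, phi, theta, latitude_degrees(phi))
```

A reference that uses the opposite (physics) convention would simply swap the roles of the two angle arguments, which is why it pays to fix the convention in one place.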

Warning: In many books, particularly those in physics, the roles of $\theta$ and $\varphi$ are reversed,
leading to much confusion when one is perusing the literature. We prefer the
mathematical convention, since the azimuthal angle $\theta$ coincides with the cylindrical angle
coordinate (as well as the polar coordinate on the $(x,y)$-plane), thus avoiding unnecessary
confusion when going from one coordinate system to the other. You must be attentive
to the convention being used when consulting any reference!

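In spherical coordinates, the Laplace equation for $u(r,\varphi,\theta)$ takes the form

$$ \Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{2}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \varphi^2} + \frac{\cos\varphi}{r^2 \sin\varphi} \frac{\partial u}{\partial \varphi} + \frac{1}{r^2 \sin^2\varphi} \frac{\partial^2 u}{\partial \theta^2} = 0. \tag{12.16} $$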


This  important  formula  is  the  final  result  of  a  fairly  nasty  chain  rule  computation,  whose 
details  are  left  to  the  motivated  reader.  (Set  aside  lots  of  paper  and  keep  an  eraser  handy!) 
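
Short of redoing the chain rule computation, formula (12.16) can at least be sanity-checked numerically. The sketch below evaluates its left-hand side by central differences for two test functions chosen for illustration: a harmonic polynomial, for which the result should be close to 0, and $x^2$, whose Laplacian is 2. The function names and the sample point are assumptions.

```python
import math

def spherical_laplacian(u, r, phi, theta, h=1e-4):
    """Evaluate the left-hand side of (12.16) with central differences."""
    u0 = u(r, phi, theta)
    u_r = (u(r + h, phi, theta) - u(r - h, phi, theta)) / (2 * h)
    u_rr = (u(r + h, phi, theta) - 2 * u0 + u(r - h, phi, theta)) / h ** 2
    u_p = (u(r, phi + h, theta) - u(r, phi - h, theta)) / (2 * h)
    u_pp = (u(r, phi + h, theta) - 2 * u0 + u(r, phi - h, theta)) / h ** 2
    u_tt = (u(r, phi, theta + h) - 2 * u0 + u(r, phi, theta - h)) / h ** 2
    s, c = math.sin(phi), math.cos(phi)
    return (u_rr + 2 * u_r / r + u_pp / r ** 2
            + c * u_p / (r ** 2 * s) + u_tt / (r ** 2 * s ** 2))

def in_spherical(f):
    """Rewrite a function of (x, y, z) in the spherical coordinates (12.15)."""
    def g(r, phi, theta):
        return f(r * math.sin(phi) * math.cos(theta),
                 r * math.sin(phi) * math.sin(theta),
                 r * math.cos(phi))
    return g

harmonic = in_spherical(lambda x, y, z: x * x + y * y - 2 * z * z)   # Laplacian 0
quadratic = in_spherical(lambda x, y, z: x * x)                       # Laplacian 2

point = (0.8, 1.1, 0.5)   # (r, phi, theta), away from the z-axis
print(spherical_laplacian(harmonic, *point), spherical_laplacian(quadratic, *point))
```

The sample point must stay away from the $z$-axis, where $\sin\varphi = 0$ and the coordinate expression degenerates even though the Laplacian itself is perfectly well behaved there.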

To  construct  separable  solutions  to  the  spherical  coordinate  form  (12.16)  of  the  Laplace 
equation,  we  begin  by  separating  off  the  radial  part  of  the  solution,  setting 


$$ u(r,\varphi,\theta) = v(r)\, w(\varphi,\theta). \tag{12.17} $$


Substituting this ansatz into (12.16), multiplying the resulting equation through by $r^2/(v\, w)$,
and then placing all the terms involving $r$ on one side yields

$$ \frac{1}{v} \left( r^2 \frac{d^2 v}{dr^2} + 2 r \frac{dv}{dr} \right) = - \frac{1}{w}\, \Delta_S[w], \tag{12.18} $$

where

$$ \Delta_S[w] = \frac{\partial^2 w}{\partial \varphi^2} + \frac{\cos\varphi}{\sin\varphi} \frac{\partial w}{\partial \varphi} + \frac{1}{\sin^2\varphi} \frac{\partial^2 w}{\partial \theta^2}. \tag{12.19} $$

The second-order differential operator $\Delta_S$, which involves only the angular components
of the full Laplacian operator $\Delta$, is of particular significance. It is known as the spherical
Laplacian, and governs the equilibrium and dynamics of thin spherical shells; see
Example 12.15 below.

Returning to equation (12.18), our usual separation argument applies. The left-hand
side depends only on $r$, while the right-hand side depends only on the angles $\varphi, \theta$. This can
occur only when both sides are equal to a common separation constant, which we denote by
$\mu$. As a consequence, the radial component $v(r)$ satisfies the ordinary differential equation

$$ r^2 v'' + 2 r v' - \mu v = 0, \tag{12.20} $$

which is of Euler type (11.89), and hence can be readily solved. However, let us put this
equation aside for the time being, and concentrate our efforts on the more complicated
angular components.

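The second equation in (12.18) assumes the form

$$ \Delta_S[w] + \mu\, w = \frac{\partial^2 w}{\partial \varphi^2} + \frac{\cos\varphi}{\sin\varphi} \frac{\partial w}{\partial \varphi} + \frac{1}{\sin^2\varphi} \frac{\partial^2 w}{\partial \theta^2} + \mu\, w = 0. \tag{12.21} $$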

This second-order partial differential equation can be regarded as the eigenvalue equation
for the spherical Laplacian operator $\Delta_S$ and is known as the spherical Helmholtz equation.
To find explicit solutions, we adopt a further separation of angular variables,

$$ w(\varphi,\theta) = p(\varphi)\, q(\theta), \tag{12.22} $$

which we substitute into (12.21). Dividing the result by the product $w = p\, q$, multiplying
by $\sin^2\varphi$, and then rearranging terms, we are led to the separated system

$$ \frac{1}{p} \left( \sin^2\varphi\, \frac{d^2 p}{d\varphi^2} + \cos\varphi \sin\varphi\, \frac{dp}{d\varphi} \right) + \mu \sin^2\varphi = - \frac{1}{q} \frac{d^2 q}{d\theta^2} = \nu, $$

where, by our usual argument, $\nu$ is another separation constant. The spherical Helmholtz
equation thereby splits into a pair of ordinary differential equations

$$ \sin^2\varphi\, \frac{d^2 p}{d\varphi^2} + \cos\varphi \sin\varphi\, \frac{dp}{d\varphi} + (\mu \sin^2\varphi - \nu)\, p = 0, \qquad \frac{d^2 q}{d\theta^2} + \nu\, q = 0. $$


The equation for $q(\theta)$ is easy to solve. As one circumnavigates the sphere, the azimuthal
angle $\theta$ increases from $-\pi$ to $\pi$, so $q(\theta)$ must be a $2\pi$-periodic function. Thus, $q(\theta)$ solves
the well-studied periodic boundary value problem treated, for instance, in (4.109). Up to
a constant multiple, nonzero periodic solutions occur only when the separation constant
assumes one of the values $\nu = m^2$, where $m = 0, 1, 2, \ldots$ is an integer, with

$$ q(\theta) = \cos m\theta \quad \text{or} \quad \sin m\theta, \qquad m = 0, 1, 2, \ldots. \tag{12.23} $$

Each positive $\nu = m^2 > 0$ admits two linearly independent $2\pi$-periodic solutions, while
when $\nu = 0$, only the constant solutions are periodic.


The  Legendre  Equation  and  Ferrers  Functions 


With this information, we endeavor to solve the zenith differential equation

$$ \sin^2\varphi\, \frac{d^2 p}{d\varphi^2} + \cos\varphi \sin\varphi\, \frac{dp}{d\varphi} + (\mu \sin^2\varphi - m^2)\, p = 0. \tag{12.24} $$
This is not so easy, and constructing analytic formulas for its solutions requires some
ingenuity.  The  motivation  behind  the  following  steps  may  not  be  so  apparent;  indeed, 
they  are  the  culmination  of  a  long,  detailed  study  of  this  important  differential  equation 
by  mathematicians  over  the  last  200  years. 

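As an initial simplification, let us get rid of the trigonometric functions, by invoking
the change of variables

$$ t = \cos\varphi, \qquad \text{with} \qquad p(\varphi) = P(\cos\varphi) = P(t). $$

Since $0 \le \varphi \le \pi$, we have $0 \le \sqrt{1 - t^2} = \sin\varphi \le 1$. According to the chain rule,

$$ \frac{dp}{d\varphi} = \frac{dP}{dt} \frac{dt}{d\varphi} = -\sin\varphi\, \frac{dP}{dt} = -\sqrt{1 - t^2}\, \frac{dP}{dt}, $$

$$ \frac{d^2 p}{d\varphi^2} = -\sin\varphi\, \frac{d}{dt} \left( -\sqrt{1 - t^2}\, \frac{dP}{dt} \right) = (1 - t^2) \frac{d^2 P}{dt^2} - t\, \frac{dP}{dt}. \tag{12.25} $$

Substituting these expressions into (12.24), we conclude that $P(t)$ must satisfy

$$ (1 - t^2)^2 \frac{d^2 P}{dt^2} - 2 t (1 - t^2) \frac{dP}{dt} + \left[ \mu (1 - t^2) - m^2 \right] P = 0. \tag{12.26} $$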
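
The change of variables leading from (12.24) to (12.26) can be spot-checked numerically: for any smooth $P(t)$, the left-hand side of (12.24) applied to $p(\varphi) = P(\cos\varphi)$ should equal the left-hand side of (12.26) evaluated at $t = \cos\varphi$. In the sketch below the test function $P$, the parameter values, and the sample angle are arbitrary illustrative choices; $P$ is not claimed to solve either equation.

```python
import math

def P(t):
    """An arbitrary smooth test function (not a solution of the equation)."""
    return t ** 3 + 2 * t

def dP(t):
    return 3 * t ** 2 + 2

def d2P(t):
    return 6 * t

def p(phi):
    """The same function in the zenith angle: p(phi) = P(cos phi)."""
    return P(math.cos(phi))

def zenith_lhs(phi, mu, m, h=1e-4):
    """Left-hand side of (12.24), with p' and p'' from central differences."""
    dp = (p(phi + h) - p(phi - h)) / (2 * h)
    d2p = (p(phi + h) - 2 * p(phi) + p(phi - h)) / h ** 2
    s, c = math.sin(phi), math.cos(phi)
    return s * s * d2p + c * s * dp + (mu * s * s - m * m) * p(phi)

def legendre_lhs(t, mu, m):
    """Left-hand side of (12.26), with exact derivatives of P."""
    w = 1 - t * t
    return w * w * d2P(t) - 2 * t * w * dP(t) + (mu * w - m * m) * P(t)

mu, m, phi = 6.0, 1, 0.9
print(zenith_lhs(phi, mu, m), legendre_lhs(math.cos(phi), mu, m))
```

The two printed values agree to within the finite-difference error, which is what one expects if the chain rule formulas (12.25) have been applied correctly.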


Unfortunately,  the  resulting  differential  equation  is  still  not  elementary,  but  at  least  its 
coefficients  are  polynomials.  It  is  known  as  the  Legendre  differential  equation  of  order  m, 
having  first  been  employed  by  Adrien-Marie  Legendre  to  study  the  gravitational  attraction 
of  ellipsoidal  bodies.  In  the  cases  of  interest  to  us,  the  order  parameter  m  is  an  integer, 
while the separation constant $\mu$ plays the role of an eigenvalue.

Power series solutions to the Legendre equation can be constructed by the standard
techniques presented in Section 11.3. The most general solution is a new type of special
function, called a Legendre function, [86]. However, it turns out that the solutions we are
actually interested in can all be written in terms of elementary algebraic functions. First
of all, since $t = \cos\varphi$, the solution only needs to be defined on the interval $-1 \le t \le 1$,
the so-called cut locus. The endpoints of the cut locus, $t = 1$ and $t = -1$, correspond to
the sphere's north pole, $\varphi = 0$, and south pole, $\varphi = \pi$, respectively. Both endpoints are
singular points for the Legendre equation, since the coefficient $(1 - t^2)^2$ of the leading-order
derivative vanishes when $t = \pm 1$. In fact, both are regular singular points, as you are asked
to show in Exercise 12.2.11. Since ultimately we need the separable solution (12.17) to be a
well-defined function of $x, y, z$ (even at points where the spherical coordinates degenerate,
i.e., on the $z$-axis), we need $p(\varphi)$ to be well defined at $\varphi = 0$ and $\pi$, and this requires $P(t)$
to be bounded at the singular points:

$$|P(-1)| < \infty, \qquad |P(+1)| < \infty. \tag{12.27}$$


Let us begin our analysis with the Legendre equation of order $m = 0$:

$$(1 - t^2)\,\frac{d^2P}{dt^2} - 2t\,\frac{dP}{dt} + \mu\,P = 0. \tag{12.28}$$

In this case, the eigenfunctions, i.e., solutions to the Legendre boundary value problem
(12.27-28), are the Legendre polynomials

$$P_n(t) = \frac{(-1)^n}{2^n\,n!}\,\frac{d^n}{dt^n}\,(1 - t^2)^n, \tag{12.29}$$

with corresponding eigenvalue parameter $\mu = n(n + 1)$. (The initial factor is by common
convention, [86]; see (12.64) for the explicit formula.) The first few are

$$\begin{aligned}
P_0(t) &= 1, & P_1(t) &= t, & P_2(t) &= \tfrac32 t^2 - \tfrac12, & P_3(t) &= \tfrac52 t^3 - \tfrac32 t, \\
P_4(t) &= \tfrac{35}{8} t^4 - \tfrac{15}{4} t^2 + \tfrac38, & P_5(t) &= \tfrac{63}{8} t^5 - \tfrac{35}{4} t^3 + \tfrac{15}{8} t,
\end{aligned}$$

and are graphed in Figure 12.2.

Figure 12.2. Legendre polynomials.
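The formula (12.29) is easy to evaluate exactly by machine. The sketch below is an illustration only: it stores a polynomial as a list of rational coefficients (lowest degree first), and the helper names `poly_mul`, `poly_diff`, and `legendre` are our own choices rather than anything from the text.

```python
from fractions import Fraction
from math import factorial

ONE_MINUS_T2 = [Fraction(1), Fraction(0), Fraction(-1)]  # coefficients of 1 - t^2


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_diff(a):
    """Differentiate a coefficient list; the derivative of a constant is [0]."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def legendre(n):
    """Coefficients of P_n(t) computed from formula (12.29)."""
    q = [Fraction(1)]
    for _ in range(n):
        q = poly_mul(q, ONE_MINUS_T2)   # q = (1 - t^2)^n
    for _ in range(n):
        q = poly_diff(q)                # q = d^n/dt^n (1 - t^2)^n
    scale = Fraction((-1) ** n, 2 ** n * factorial(n))
    return [scale * c for c in q]


# Print the coefficient lists of P_0, ..., P_5 for comparison with the list above.
for n in range(6):
    print(n, [str(c) for c in legendre(n)])
```

Working with `Fraction` rather than floating point keeps the coefficients exact, so the output can be compared digit for digit with the listed polynomials.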

Each Legendre polynomial clearly satisfies the boundary conditions (12.27). To verify
that they are indeed solutions to the differential equation (12.28), we set

$$Q_n(t) = (1 - t^2)^n.$$

By the chain rule, the derivative of $Q_n(t)$ is

$$Q_n' = -2nt\,(1 - t^2)^{n-1},$$

and hence

$$(1 - t^2)\,Q_n' = -2nt\,(1 - t^2)^n = -2nt\,Q_n.$$

Differentiating the latter formula yields

$$(1 - t^2)\,Q_n'' - 2t\,Q_n' = -2nt\,Q_n' - 2n\,Q_n, \qquad\text{or}\qquad (1 - t^2)\,Q_n'' = -2(n-1)\,t\,Q_n' - 2n\,Q_n.$$

A simple induction proves that the $k$th order derivative $Q_n^{(k)}(t) = \dfrac{d^kQ_n}{dt^k}$ satisfies

$$\begin{aligned}
(1 - t^2)\,Q_n^{(k+2)} &= -2(n - k - 1)\,t\,Q_n^{(k+1)} - 2\bigl[n + (n-1) + \cdots + (n-k)\bigr]\,Q_n^{(k)} \\
&= -2(n - k - 1)\,t\,Q_n^{(k+1)} - (k+1)(2n - k)\,Q_n^{(k)}.
\end{aligned} \tag{12.30}$$

In particular, when $k = n$, this reduces to

$$(1 - t^2)\,Q_n^{(n+2)} - 2t\,Q_n^{(n+1)} + n(n+1)\,Q_n^{(n)} = 0,$$

and so $\widehat P_n(t) = Q_n^{(n)}(t)$ satisfies

$$(1 - t^2)\,\widehat P_n'' - 2t\,\widehat P_n' + n(n+1)\,\widehat P_n = 0,$$

which is precisely the order 0 Legendre equation (12.28) with $\mu = n(n+1)$. The Legendre
polynomial $P_n$ is a constant multiple of $\widehat P_n$, and hence it too satisfies the order 0 Legendre
equation. According to Theorem 12.3 below, the Legendre polynomials form a complete
system of eigenfunctions for the order 0 Legendre boundary value problem.
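As a quick sanity check of this conclusion (a worked instance, not part of the general argument), take $n = 2$, so that $\mu = 6$ and $P_2(t) = \tfrac32 t^2 - \tfrac12$:

```latex
P_2' = 3t, \qquad P_2'' = 3, \qquad
(1 - t^2)\cdot 3 \;-\; 2t\cdot 3t \;+\; 6\left(\tfrac32 t^2 - \tfrac12\right)
  = 3 - 3t^2 - 6t^2 + 9t^2 - 3 = 0 .
```

The constant terms and the $t^2$ terms cancel separately, as they must for a polynomial identity.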

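When the order $m > 0$, the eigenfunctions of the Legendre boundary value problem
(12.26-27) are not always polynomials. They are known as the Ferrers functions, named
after the nineteenth-century British mathematician Norman Ferrers, or, more generally, as
associated Legendre functions. They have the explicit formula

$$P_n^m(t) = (1 - t^2)^{m/2}\,\frac{d^m}{dt^m}P_n(t) = (-1)^n\,\frac{(1 - t^2)^{m/2}}{2^n\,n!}\,\frac{d^{n+m}}{dt^{n+m}}\,(1 - t^2)^n, \qquad n = m, m+1, \ldots, \tag{12.31}$$

which generalizes the formula (12.29) for the Legendre polynomials. The eigenvalue
parameter for $P_n^m(t)$ is also $\mu = n(n + 1)$. In particular, $P_n^0(t) = P_n(t)$. Here is a list of the
first few Ferrers functions, which, for completeness, includes Legendre polynomials:

$$\begin{aligned}
P_0^0(t) &= 1, \\
P_1^0(t) &= t, & P_1^1(t) &= \sqrt{1 - t^2}, \\
P_2^0(t) &= -\tfrac12 + \tfrac32 t^2, & P_2^1(t) &= 3t\,\sqrt{1 - t^2}, & P_2^2(t) &= 3\,(1 - t^2), \\
P_3^0(t) &= -\tfrac32 t + \tfrac52 t^3, & P_3^1(t) &= \bigl(-\tfrac32 + \tfrac{15}{2} t^2\bigr)\sqrt{1 - t^2}, & P_3^2(t) &= 15t\,(1 - t^2), \\
P_3^3(t) &= 15\,(1 - t^2)^{3/2}, \\
P_4^0(t) &= \tfrac38 - \tfrac{15}{4} t^2 + \tfrac{35}{8} t^4, & P_4^1(t) &= \bigl(-\tfrac{15}{2} t + \tfrac{35}{2} t^3\bigr)\sqrt{1 - t^2}, & P_4^2(t) &= \bigl(-\tfrac{15}{2} + \tfrac{105}{2} t^2\bigr)(1 - t^2), \\
P_4^3(t) &= 105\,t\,(1 - t^2)^{3/2}, & P_4^4(t) &= 105\,(1 - t^2)^2.
\end{aligned} \tag{12.32}$$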


When $m = 2k \le n$ is an even integer, $P_n^m(t)$ is a polynomial function, while when
$m = 2k + 1 \le n$ is odd, there is an extra factor of $\sqrt{1 - t^2}$. Keep in mind that the square root
is real and positive, since we are restricting our attention to the interval $-1 \le t \le 1$. If
$m > n$, formula (12.31) reduces to the zero function and so is not included in the final
tally.
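Formula (12.31) translates directly into a short evaluation routine. The following sketch assumes the normalization used here (no extra $(-1)^m$ factor); the function names `ferrers_poly` and `ferrers` are illustrative, and polynomials are again stored as exact coefficient lists.

```python
from fractions import Fraction
from math import factorial, sqrt, isclose

ONE_MINUS_T2 = [Fraction(1), Fraction(0), Fraction(-1)]  # coefficients of 1 - t^2


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_diff(a):
    """Differentiate a coefficient list; the derivative of a constant is [0]."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def ferrers_poly(n, m):
    """Coefficients of d^m/dt^m P_n(t), the polynomial factor in (12.31)."""
    q = [Fraction(1)]
    for _ in range(n):
        q = poly_mul(q, ONE_MINUS_T2)   # (1 - t^2)^n
    for _ in range(n + m):
        q = poly_diff(q)                # d^{n+m}/dt^{n+m} (1 - t^2)^n
    scale = Fraction((-1) ** n, 2 ** n * factorial(n))
    return [scale * c for c in q]


def ferrers(n, m, t):
    """Evaluate P_n^m(t) for -1 <= t <= 1 using (12.31)."""
    poly_value = sum(float(c) * t ** k for k, c in enumerate(ferrers_poly(n, m)))
    return (1.0 - t * t) ** (m / 2) * poly_value


# Compare with the closed forms P_2^1 = 3t sqrt(1 - t^2) and P_3^3 = 15 (1 - t^2)^(3/2).
print(ferrers(2, 1, 0.5), 3 * 0.5 * sqrt(1 - 0.25))
print(isclose(ferrers(3, 3, 0.5), 15 * 0.75 ** 1.5))
# When m > n the formula reduces to the zero function.
print(ferrers(2, 3, 0.3))
```

Keeping the polynomial factor separate from the power of $1 - t^2$ mirrors the structure of (12.31) and avoids taking square roots until the final evaluation.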


Warning: Even though half of the Ferrers functions are polynomials, only those with
$m = 0$, i.e., $P_n(t) = P_n^0(t)$, are called Legendre polynomials.


Warning: Some authors include a $(-1)^m$ factor in the formula, resulting in the opposite sign
when $m$ is odd. Another source of confusion is that many tables define the associated Legendre
functions using the alternative initial factor $(t^2 - 1)^{m/2}$. But this is unsuitable, since we are solely
interested in values of $t$ lying in the interval $-1 \le t \le 1$, and this convention would result in a
complex-valued function when $m$ is odd. Following [86], we use the term "Ferrers function" to
refer to the restriction of the associated Legendre function to the cut locus $-1 \le t \le 1$.

Figure 12.3. Ferrers functions. (Panels, labeled by their ranges: $0 \le P_1^1(t) \le 1$; $-1.5 \le P_2^1(t) \le 1.5$; $0 \le P_2^2(t) \le 3$; $-1.5 \le P_3^1(t) \le 2.07$; $-5.77 \le P_3^2(t) \le 5.77$; $0 \le P_3^3(t) \le 15$; $-2.64 \le P_4^1(t) \le 2.64$; $-7.5 \le P_4^2(t) \le 9.64$; $-34.1 \le P_4^3(t) \le 34.1$; $0 \le P_4^4(t) \le 105$.)


Figure 12.3 displays graphs of the Ferrers functions $P_n^m(t)$ for $1 \le m \le n \le 4$.
Pay  particular  attention  to  the  fact  that,  owing  to  the  choice  of  normalization  factor,  the 
graphs  have  very  different  vertical  scales,  as  indicated  by  their  minimum  and  maximum 
values  (rounded  to  two  decimal  places)  written  below  each  —  although  one  always  has  the 
freedom  to  rescale  the  eigenfunctions  as  desired,  e.g.,  so  as  to  be  orthonormal. 

To show that the Ferrers functions $P_n^m(t)$ satisfy the Legendre differential equation
(12.26) of order $m$, we substitute $k = m + n$ in (12.30):

$$(1 - t^2)\,\frac{d^2R_n^m}{dt^2} - 2(m+1)\,t\,\frac{dR_n^m}{dt} + (m + n + 1)(n - m)\,R_n^m = 0, \tag{12.33}$$

where

$$R_n^m(t) = Q_n^{(m+n)}(t).$$

This is not the order $m$ Legendre equation, but it can be converted into it by setting

$$R_n^m(t) = (1 - t^2)^{-m/2}\,S_n^m(t).$$

Differentiating, we obtain

$$\begin{aligned}
\frac{dR_n^m}{dt} &= (1 - t^2)^{-m/2}\,\frac{dS_n^m}{dt} + m\,t\,(1 - t^2)^{-m/2-1}\,S_n^m, \\
\frac{d^2R_n^m}{dt^2} &= (1 - t^2)^{-m/2}\,\frac{d^2S_n^m}{dt^2} + 2m\,t\,(1 - t^2)^{-m/2-1}\,\frac{dS_n^m}{dt} + \bigl[m + m(m+1)\,t^2\bigr](1 - t^2)^{-m/2-2}\,S_n^m.
\end{aligned}$$

Therefore, after a little algebra, equation (12.33) takes the alternative form

$$(1 - t^2)^{-m/2+1}\,\frac{d^2S_n^m}{dt^2} - 2t\,(1 - t^2)^{-m/2}\,\frac{dS_n^m}{dt} + \bigl[n(n+1)(1 - t^2) - m^2\bigr](1 - t^2)^{-m/2-1}\,S_n^m = 0,$$

which, when multiplied by $(1 - t^2)^{m/2+1}$, is precisely the order $m$ Legendre equation (12.26)
with $\mu = n(n + 1)$. Thus,

$$S_n^m(t) = (1 - t^2)^{m/2}\,R_n^m(t) = (1 - t^2)^{m/2}\,\frac{d^{n+m}}{dt^{n+m}}\,(1 - t^2)^n,$$

which is a constant multiple of the Ferrers function $P_n^m(t)$, is a solution to the order $m$
Legendre equation. Moreover, we note that

$$P_n^m(1) = P_n^m(-1) = 0, \qquad\text{when}\quad m > 0, \tag{12.34}$$

and we conclude that $P_n^m(t)$ is an eigenfunction for the order $m$ Legendre boundary value
problem.
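The key intermediate identity (12.33) involves only polynomials, so it can be confirmed exactly for small $n$ and $m$. The sketch below does this with rational coefficient lists; the name `residual_12_33` and the helper functions are our own, introduced purely for illustration.

```python
from fractions import Fraction

ONE_MINUS_T2 = [Fraction(1), Fraction(0), Fraction(-1)]  # coefficients of 1 - t^2


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_diff(a):
    """Differentiate a coefficient list; the derivative of a constant is [0]."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def poly_add(*polys):
    """Add coefficient lists of possibly different lengths."""
    out = [Fraction(0)] * max(len(p) for p in polys)
    for p in polys:
        for k, c in enumerate(p):
            out[k] += c
    return out


def residual_12_33(n, m):
    """Left-hand side of (12.33) for R = d^{n+m}/dt^{n+m} (1 - t^2)^n."""
    r = [Fraction(1)]
    for _ in range(n):
        r = poly_mul(r, ONE_MINUS_T2)
    for _ in range(n + m):
        r = poly_diff(r)
    r1 = poly_diff(r)
    r2 = poly_diff(r1)
    return poly_add(
        poly_mul(ONE_MINUS_T2, r2),                           # (1 - t^2) R''
        poly_mul([Fraction(0), Fraction(-2 * (m + 1))], r1),  # -2 (m + 1) t R'
        [Fraction((m + n + 1) * (n - m)) * c for c in r],     # (m + n + 1)(n - m) R
    )


# Every coefficient of the residual should vanish for 0 <= m <= n <= 5.
all_zero = all(c == 0
               for n in range(6)
               for m in range(n + 1)
               for c in residual_12_33(n, m))
print(all_zero)
```

This checks only (12.33); the passage to the order $m$ Legendre equation still relies on the change of dependent variable described in the text.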

The following result states that the Ferrers functions provide a complete list of
solutions to the Legendre boundary value problem (12.26-27).


Theorem 12.3. Let $m \ge 0$ be a nonnegative integer. Then the order $m$ Legendre
boundary value problem prescribed by (12.26-27) has eigenvalues $\mu_n = n(n + 1)$ for $n =
0, 1, 2, \ldots$, and associated eigenfunctions $P_n^m(t)$, where $m = 0, \ldots, n$. Moreover, the Ferrers
eigenfunctions form a complete orthogonal system relative to the $L^2$ inner product on the
cut locus $[-1, 1]$.


Returning to the zenith variable $\varphi$ via (12.25), Theorem 12.3 implies that our original
boundary value problem

$$\sin^2\varphi\,\frac{d^2p}{d\varphi^2} + \cos\varphi\,\sin\varphi\,\frac{dp}{d\varphi} + (\mu\sin^2\varphi - m^2)\,p = 0, \qquad |p(0)|,\ |p(\pi)| < \infty, \tag{12.35}$$

has its eigenvalues and eigenfunctions expressed in terms of the Ferrers functions:

$$\mu_n = n(n + 1), \qquad p_n^m(\varphi) = P_n^m(\cos\varphi), \qquad\text{for}\quad 0 \le m \le n. \tag{12.36}$$

Since $P_n^m(t)$ is either a polynomial or a polynomial multiplied by a power of $\sqrt{1 - t^2}$,
the eigenfunction $p_n^m(\varphi)$ is a trigonometric polynomial of degree $n$, which we call a
trigonometric Ferrers function. Here are the first few, written in Fourier form, as in (3.38):

$$\begin{aligned}
p_0^0(\varphi) &= 1, \\
p_1^0(\varphi) &= \cos\varphi, & p_1^1(\varphi) &= \sin\varphi, \\
p_2^0(\varphi) &= \tfrac14 + \tfrac34\cos 2\varphi, & p_2^1(\varphi) &= \tfrac32\sin 2\varphi, \\
p_2^2(\varphi) &= \tfrac32 - \tfrac32\cos 2\varphi, \\
p_3^0(\varphi) &= \tfrac38\cos\varphi + \tfrac58\cos 3\varphi, & p_3^1(\varphi) &= \tfrac38\sin\varphi + \tfrac{15}{8}\sin 3\varphi, \\
p_3^2(\varphi) &= \tfrac{15}{4}\cos\varphi - \tfrac{15}{4}\cos 3\varphi, & p_3^3(\varphi) &= \tfrac{45}{4}\sin\varphi - \tfrac{15}{4}\sin 3\varphi, \\
p_4^0(\varphi) &= \tfrac{9}{64} + \tfrac{5}{16}\cos 2\varphi + \tfrac{35}{64}\cos 4\varphi, & p_4^1(\varphi) &= \tfrac58\sin 2\varphi + \tfrac{35}{16}\sin 4\varphi, \\
p_4^2(\varphi) &= \tfrac{45}{16} + \tfrac{15}{4}\cos 2\varphi - \tfrac{105}{16}\cos 4\varphi, & p_4^3(\varphi) &= \tfrac{105}{4}\sin 2\varphi - \tfrac{105}{8}\sin 4\varphi, \\
p_4^4(\varphi) &= \tfrac{315}{8} - \tfrac{105}{2}\cos 2\varphi + \tfrac{105}{8}\cos 4\varphi.
\end{aligned} \tag{12.37}$$

It is also instructive to plot the eigenfunctions in terms of the zenith angle $\varphi$; see Figure 12.4.
As in Figure 12.3, the vertical scales are not the same, as indicated by the listed minimum
and maximum values.

Figure 12.4. Trigonometric Ferrers functions.


Spherical Harmonics


At this stage, we have determined both angular components of our separable solutions
(12.22). Multiplying the two parts together results in the spherical angle functions

$$\begin{aligned}
Y_n^m(\varphi, \theta) &= p_n^m(\varphi)\,\cos m\theta, \\
\widetilde Y_n^m(\varphi, \theta) &= p_n^m(\varphi)\,\sin m\theta,
\end{aligned}
\qquad
\begin{aligned}
n &= 0, 1, 2, \ldots, \\
m &= 0, 1, \ldots, n,
\end{aligned} \tag{12.38}$$

known as spherical harmonics. They satisfy the spherical Helmholtz equation

$$\Delta_S Y_n^m + n(n + 1)\,Y_n^m = 0 = \Delta_S \widetilde Y_n^m + n(n + 1)\,\widetilde Y_n^m, \tag{12.39}$$

and so are eigenfunctions for the spherical Laplacian operator, (12.19), with associated
eigenvalues $\mu_n = n(n + 1)$ for $n = 0, 1, 2, \ldots$. The $n$th eigenvalue $\mu_n$ admits a $(2n + 1)$-dimensional
eigenspace, spanned by the spherical harmonics

$$Y_n^0(\varphi), \quad Y_n^1(\varphi, \theta), \quad \ldots, \quad Y_n^n(\varphi, \theta), \quad \widetilde Y_n^1(\varphi, \theta), \quad \ldots, \quad \widetilde Y_n^n(\varphi, \theta).$$

(The omitted function $\widetilde Y_n^0(\varphi, \theta) \equiv 0$ is trivial, and so does not contribute.) In Figure 12.5
we plot the first few spherical harmonic surfaces $r = Y_n^m(\varphi, \theta)$. In these graphs, in view of
the spherical coordinate formulae (12.15), points with a negative $r$ coordinate appear on
the opposite side of the origin from their positive $r$ counterparts. Incidentally, the graphs of
the other spherical harmonic surfaces $r = \widetilde Y_n^m(\varphi, \theta)$, when $m > 0$, are obtained by rotation
around the $z$-axis by $90^\circ/m$, since $\sin m\theta = \cos m\bigl(\theta - \tfrac{\pi}{2m}\bigr)$; see Exercise 12.2.20. On the other hand, the graphs of $Y_n^0$ are
cylindrically symmetric (why?), and hence unaffected by such a rotation.

Self-adjointness of the spherical Laplacian, as per Exercise 12.2.21, implies that the
spherical harmonics are orthogonal with respect to the $L^2$ inner product

$$\langle f, g \rangle = \iint_{S_1} f\,g\;dS = \int_{-\pi}^{\pi}\!\int_0^{\pi} f(\varphi, \theta)\,g(\varphi, \theta)\,\sin\varphi\;d\varphi\,d\theta \tag{12.40}$$

given by integrating the product of the functions with respect to the surface area element
$dS = \sin\varphi\,d\varphi\,d\theta$ on the unit sphere $S_1 = \{\,\|\mathbf{x}\| = 1\,\}$. More correctly, self-adjointness only
guarantees orthogonality of the harmonics corresponding to distinct eigenvalues: $\mu_n \ne \mu_l$.
However, the orthogonality relations

$$\begin{aligned}
\langle Y_n^m, Y_l^k \rangle &= \iint_{S_1} Y_n^m\,Y_l^k\;dS = 0, && \text{for}\quad (m, n) \ne (k, l), \\
\langle Y_n^m, \widetilde Y_l^k \rangle &= \iint_{S_1} Y_n^m\,\widetilde Y_l^k\;dS = 0, && \text{for all}\quad (m, n),\ (k, l), \\
\langle \widetilde Y_n^m, \widetilde Y_l^k \rangle &= \iint_{S_1} \widetilde Y_n^m\,\widetilde Y_l^k\;dS = 0, && \text{for}\quad (m, n) \ne (k, l),
\end{aligned} \tag{12.41}$$

do, in fact, hold in full generality; Exercise 12.2.22 asks you to supply the details. Moreover,
their norms can be explicitly computed:

$$\|Y_n^0\| = \sqrt{\frac{4\pi}{2n + 1}}, \qquad \|Y_n^m\| = \|\widetilde Y_n^m\| = \sqrt{\frac{2\pi\,(n + m)!}{(2n + 1)(n - m)!}}, \qquad m = 1, \ldots, n. \tag{12.42}$$


Proofs  of  the  latter  formulae  are  outlined  in  Exercise  12.2.24. 
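The norm formulae can be spot-checked by exact integration. Substituting $t = \cos\varphi$ in (12.40) turns the $\varphi$-integral of $(p_n^m)^2 \sin\varphi$ into the integral of $(P_n^m)^2$ over $[-1, 1]$, whose integrand is a polynomial, while the $\theta$-integral of $\cos^2 m\theta$ contributes $\pi$ (or $2\pi$ when $m = 0$). The sketch below, with illustrative helper names of our own, compares the result with (12.42).

```python
from fractions import Fraction
from math import factorial

ONE_MINUS_T2 = [Fraction(1), Fraction(0), Fraction(-1)]  # coefficients of 1 - t^2


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_diff(a):
    """Differentiate a coefficient list; the derivative of a constant is [0]."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def ferrers_poly(n, m):
    """Coefficients of d^m/dt^m P_n(t), the polynomial factor of P_n^m(t)."""
    q = [Fraction(1)]
    for _ in range(n):
        q = poly_mul(q, ONE_MINUS_T2)
    for _ in range(n + m):
        q = poly_diff(q)
    return [Fraction((-1) ** n, 2 ** n * factorial(n)) * c for c in q]


def ferrers_inner(n, l, m):
    """Exact integral of P_n^m(t) P_l^m(t) over the cut locus [-1, 1]."""
    integrand = poly_mul(ferrers_poly(n, m), ferrers_poly(l, m))
    for _ in range(m):
        integrand = poly_mul(integrand, ONE_MINUS_T2)  # restore the factor (1 - t^2)^m
    # The integral of t^k over [-1, 1] is 2/(k + 1) for even k and 0 for odd k.
    return sum((c * Fraction(2, k + 1)
                for k, c in enumerate(integrand) if k % 2 == 0), Fraction(0))


def norm_sq_over_pi(n, m):
    """||Y_n^m||^2 divided by pi, computed from the two one-dimensional integrals."""
    theta_factor = 2 if m == 0 else 1
    return theta_factor * ferrers_inner(n, n, m)


def expected_over_pi(n, m):
    """The same quantity as predicted by (12.42)."""
    base = Fraction(2 * factorial(n + m), (2 * n + 1) * factorial(n - m))
    return 2 * base if m == 0 else base


matches = all(norm_sq_over_pi(n, m) == expected_over_pi(n, m)
              for n in range(6) for m in range(n + 1))
print(matches, ferrers_inner(1, 3, 1), ferrers_inner(2, 4, 2))
```

The last two printed values illustrate orthogonality of Ferrers functions of the same order $m$ and different degrees; orthogonality between different orders comes instead from the $\theta$-integral.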

With some further work, it can be shown that the spherical harmonics form a complete
orthogonal system of functions on the unit sphere. This means that any reasonable (e.g.,
piecewise $C^1$ or even $L^2$) function $h\colon S_1 \to \mathbb{R}$ can be expanded into a convergent spherical
harmonic series

$$h(\varphi, \theta) = \frac{c_{0,0}}{2} + \sum_{n=1}^{\infty}\left(\frac{c_{0,n}}{2}\,Y_n^0(\varphi) + \sum_{m=1}^{n}\Bigl[\,c_{m,n}\,Y_n^m(\varphi, \theta) + \widetilde c_{m,n}\,\widetilde Y_n^m(\varphi, \theta)\Bigr]\right). \tag{12.43}$$

Applying the orthogonality relations (12.41), we find that the spherical harmonic
coefficients are given by the inner products

$$c_{0,n} = \frac{2\,\langle h, Y_n^0 \rangle}{\|Y_n^0\|^2}, \qquad c_{m,n} = \frac{\langle h, Y_n^m \rangle}{\|Y_n^m\|^2}, \qquad \widetilde c_{m,n} = \frac{\langle h, \widetilde Y_n^m \rangle}{\|\widetilde Y_n^m\|^2}, \qquad 0 \le n, \quad 1 \le m \le n,$$

or, explicitly, using (12.40) and the formulae (12.42) for the norms,

$$\begin{aligned}
c_{m,n} &= \frac{(2n + 1)(n - m)!}{2\pi\,(n + m)!} \int_{-\pi}^{\pi}\!\int_0^{\pi} h(\varphi, \theta)\,p_n^m(\varphi)\,\cos m\theta\,\sin\varphi\;d\varphi\,d\theta, \\
\widetilde c_{m,n} &= \frac{(2n + 1)(n - m)!}{2\pi\,(n + m)!} \int_{-\pi}^{\pi}\!\int_0^{\pi} h(\varphi, \theta)\,p_n^m(\varphi)\,\sin m\theta\,\sin\varphi\;d\varphi\,d\theta.
\end{aligned} \tag{12.44}$$

As with an ordinary Fourier series, the extra $\tfrac12$ was appended to the $c_{0,n}$ terms in (12.43)
so that equations (12.44) remain valid for all values of $m, n$. In particular, the constant
term in the spherical harmonic series is the mean of the function $h$ over the unit sphere:

$$\frac{c_{0,0}}{2} = \frac{1}{4\pi} \iint_{S_1} h\;dS = \frac{1}{4\pi} \int_{-\pi}^{\pi}\!\int_0^{\pi} h(\varphi, \theta)\,\sin\varphi\;d\varphi\,d\theta. \tag{12.45}$$


Remark: Establishing uniform convergence of a spherical harmonic series (12.43) is
more challenging than in the Fourier series case, because, unlike the trigonometric
functions, the orthonormal spherical harmonics are not uniformly bounded. A recent survey of
what is known in this regard can be found in [10].

Remark: An alternative approach is to replace the real trigonometric functions by
complex exponentials, and work with the complex spherical harmonics†

$$\mathcal{Y}_n^m(\varphi, \theta) = Y_n^m(\varphi, \theta) + i\,\widetilde Y_n^m(\varphi, \theta) = p_n^m(\varphi)\,e^{\,i m \theta}, \qquad n = 0, 1, 2, \ldots, \quad m = -n, -n+1, \ldots, n. \tag{12.46}$$

The associated orthogonality and expansion formulas are relegated to the exercises.


Harmonic  Polynomials 

To complete our solution to the Laplace equation on the solid ball, we still need to solve the
ordinary differential equation (12.20) for the radial component $v(r)$. In view of our analysis
of the spherical Helmholtz equation, the original separation constant is $\mu = n(n + 1)$ for
some nonnegative integer $n \ge 0$, and so the radial equation takes the form

$$r^2\,v'' + 2r\,v' - n(n + 1)\,v = 0. \tag{12.47}$$

† Here we use the convention that $Y_n^{-m} = Y_n^m$, $\widetilde Y_n^{-m} = -\widetilde Y_n^m$, and $\widetilde Y_n^0 = 0$, which is
compatible with their defining formulas (12.38).

To solve this Euler equation, we substitute the power ansatz $v(r) = r^{\alpha}$, and find that the
exponent $\alpha$ must satisfy the quadratic indicial equation

$$\alpha^2 + \alpha - n(n + 1) = 0, \qquad\text{and hence}\qquad \alpha = n \quad\text{or}\quad \alpha = -(n + 1).$$

Therefore, the two linearly independent solutions are

$$v_1(r) = r^n \qquad\text{and}\qquad v_2(r) = r^{-n-1}. \tag{12.48}$$

Since  we  are  currently  interested  only  in  solutions  that  remain  bounded  at  r  =  0  —  the 
center  of  the  ball  —  we  will  retain  just  the  first  solution  v(r)  =  rn  for  our  subsequent 
analysis. 
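For completeness, here is the power ansatz substitution written out. It is a routine check rather than anything new: inserting $v = r^{\alpha}$ into the radial equation gives

```latex
r^2\,\alpha(\alpha - 1)\,r^{\alpha - 2} + 2r\,\alpha\,r^{\alpha - 1} - n(n+1)\,r^{\alpha}
  = \bigl[\alpha^2 + \alpha - n(n+1)\bigr]\,r^{\alpha}
  = (\alpha - n)(\alpha + n + 1)\,r^{\alpha},
```

so the bracket vanishes precisely when $\alpha = n$ or $\alpha = -(n+1)$.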

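At this stage, we have solved all three ordinary differential equations for the separable
solutions. We combine (12.23,38,48) to produce the following spherically separable
solutions to the Laplace equation:

$$\begin{aligned}
H_n^m &= r^n\,Y_n^m(\varphi, \theta) = r^n\,p_n^m(\varphi)\,\cos m\theta, \\
\widetilde H_n^m &= r^n\,\widetilde Y_n^m(\varphi, \theta) = r^n\,p_n^m(\varphi)\,\sin m\theta,
\end{aligned}
\qquad
\begin{aligned}
n &= 0, 1, 2, \ldots, \\
m &= 0, 1, \ldots, n.
\end{aligned} \tag{12.49}$$

Although apparently complicated, these solutions are, perhaps surprisingly, elementary
polynomial functions of the rectangular coordinates $x, y, z$, and hence are harmonic
polynomials. The first few are

$$\begin{aligned}
H_0^0 &= 1, & H_1^0 &= z, & H_2^0 &= z^2 - \tfrac12 x^2 - \tfrac12 y^2, & H_3^0 &= z^3 - \tfrac32 x^2 z - \tfrac32 y^2 z, \\
&& H_1^1 &= x, & H_2^1 &= 3xz, & H_3^1 &= 6xz^2 - \tfrac32 x^3 - \tfrac32 x y^2, \\
&& \widetilde H_1^1 &= y, & \widetilde H_2^1 &= 3yz, & \widetilde H_3^1 &= 6yz^2 - \tfrac32 x^2 y - \tfrac32 y^3, \\
&& && H_2^2 &= 3x^2 - 3y^2, & H_3^2 &= 15 x^2 z - 15 y^2 z, \\
&& && \widetilde H_2^2 &= 6xy, & \widetilde H_3^2 &= 30\,xyz, \\
&& && && H_3^3 &= 15 x^3 - 45\,x y^2, \\
&& && && \widetilde H_3^3 &= 45\,x^2 y - 15 y^3.
\end{aligned} \tag{12.50}$$

The polynomials

$$H_n^0, \quad H_n^1, \quad \ldots, \quad H_n^n, \quad \widetilde H_n^1, \quad \ldots, \quad \widetilde H_n^n$$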


are  homogeneous  of  degree  n.  Orthogonality  of  the  spherical  harmonics  implies  that  they 
form  a  basis  for  the  vector  space  comprised  of  all  homogeneous  harmonic  polynomials  of 
degree  n,  which  hence  has  dimension  2n  +  1. 
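That the listed polynomials really are harmonic can be confirmed by applying the Laplacian monomial by monomial. In the sketch below a polynomial in $x, y, z$ is stored as a dictionary mapping exponent triples to rational coefficients; this representation and the name `laplacian` are our own illustrative choices.

```python
from fractions import Fraction


def laplacian(poly):
    """Laplacian of a polynomial stored as {(i, j, k): coefficient of x^i y^j z^k}."""
    out = {}
    for (i, j, k), c in poly.items():
        for axis, e in enumerate((i, j, k)):
            if e >= 2:
                exps = [i, j, k]
                exps[axis] -= 2                      # second derivative lowers the power by 2
                key = tuple(exps)
                out[key] = out.get(key, 0) + c * e * (e - 1)
    # Drop cancelled terms so that a harmonic polynomial maps to the empty dictionary.
    return {key: c for key, c in out.items() if c != 0}


# H_2^0 = z^2 - x^2/2 - y^2/2
H20 = {(0, 0, 2): Fraction(1), (2, 0, 0): Fraction(-1, 2), (0, 2, 0): Fraction(-1, 2)}
# H_3^1 = 6 x z^2 - (3/2) x^3 - (3/2) x y^2
H31 = {(1, 0, 2): Fraction(6), (3, 0, 0): Fraction(-3, 2), (1, 2, 0): Fraction(-3, 2)}
# The sine-type polynomial of degree 3 and order 3: 45 x^2 y - 15 y^3
H33_tilde = {(2, 1, 0): Fraction(45), (0, 3, 0): Fraction(-15)}

for name, poly in [("H20", H20), ("H31", H31), ("H33_tilde", H33_tilde)]:
    print(name, "Laplacian:", laplacian(poly))

# For contrast, x^2 alone is not harmonic: its Laplacian is the constant 2.
print(laplacian({(2, 0, 0): Fraction(1)}))
```

An empty dictionary means every monomial cancelled, i.e., the polynomial satisfies the Laplace equation.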

The harmonic polynomials (12.49) form a complete system, and therefore the general
solution to the Laplace equation inside the unit ball can be written as a harmonic
polynomial series:

$$u(x, y, z) = \frac{c_{0,0}}{2} + \sum_{n=1}^{\infty}\left(\frac{c_{0,n}}{2}\,H_n^0(x, y, z) + \sum_{m=1}^{n}\Bigl[\,c_{m,n}\,H_n^m(x, y, z) + \widetilde c_{m,n}\,\widetilde H_n^m(x, y, z)\Bigr]\right), \tag{12.51}$$

or equivalently, in spherical coordinates,

$$u(r, \varphi, \theta) = \frac{c_{0,0}}{2} + \sum_{n=1}^{\infty}\left(\frac{c_{0,n}}{2}\,r^n\,Y_n^0(\varphi) + \sum_{m=1}^{n}\Bigl[\,c_{m,n}\,r^n\,Y_n^m(\varphi, \theta) + \widetilde c_{m,n}\,r^n\,\widetilde Y_n^m(\varphi, \theta)\Bigr]\right). \tag{12.52}$$

The coefficients $c_{m,n}, \widetilde c_{m,n}$ are uniquely prescribed by the boundary conditions. Indeed,
substituting (12.52) into the Dirichlet boundary conditions on the unit sphere $r = 1$ yields

$$u(1, \varphi, \theta) = \frac{c_{0,0}}{2} + \sum_{n=1}^{\infty}\left(\frac{c_{0,n}}{2}\,Y_n^0(\varphi) + \sum_{m=1}^{n}\Bigl[\,c_{m,n}\,Y_n^m(\varphi, \theta) + \widetilde c_{m,n}\,\widetilde Y_n^m(\varphi, \theta)\Bigr]\right) = h(\varphi, \theta). \tag{12.53}$$

Thus, the coefficients $c_{m,n}, \widetilde c_{m,n}$ are given by the inner product formulae (12.44). If the
terms in the resulting series are uniformly bounded (which occurs for all piecewise
continuous functions $h$, as well as all $L^2$ functions and many generalized functions such as the
delta function), then the harmonic polynomial series (12.52) converges everywhere, and,
in fact, uniformly on any smaller ball $\|\mathbf{x}\| = r \le r_0 < 1$.


Averaging, the Maximum Principle, and Analyticity


In rectangular coordinates, the $n$th summand of the series (12.51) is a homogeneous
polynomial of degree $n$. Therefore, repeating the argument used in the two-dimensional situation
(4.115), we conclude that the harmonic polynomial series is, in fact, a power series, and
hence provides the Taylor expansion for the harmonic function $u(x, y, z)$ at the origin! In
particular, its convergence for all $r < 1$ implies that the harmonic function $u(x, y, z)$ is
analytic at $x = y = z = 0$.

The constant term in such a Taylor series can be identified with the value of the
function at the origin: $u(0, 0, 0) = \tfrac12 c_{0,0}$. On the other hand, since $u = h$ on $S_1 = \partial\Omega$, the
coefficient formula (12.45) tells us that

$$u(0, 0, 0) = \frac{c_{0,0}}{2} = \frac{1}{4\pi} \iint_{S_1} u\;dS. \tag{12.54}$$

Therefore, we have established the three-dimensional counterpart of Theorem 4.8: the value
of a harmonic function $u$ at the center of the sphere is equal to the average of its values
on the sphere's surface. Moreover, each partial derivative $\dfrac{\partial^{\,i+j+k} u}{\partial x^i\,\partial y^j\,\partial z^k}(0, 0, 0)$ appears, up
to a factor, as the coefficient of the terms $x^i y^j z^k$ in the Taylor series, and hence can be
expressed as a certain linear combination of the coefficients $c_{m,n}, \widetilde c_{m,n}$, which are in turn
given by the integral formulae (12.44).

More  generally,  the  value  of  a  harmonic  function  at  the  center  of  any  ball  contained 
within  its  domain  equals  the  average  of  its  values  over  the  bounding  sphere.  As  with  the 
planar  version  in  Theorem  4.8,  it  is  preferable  to  give  a  direct  proof  that  doesn’t  rely  on 
the  series  expansion  (12.51). 
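The averaging property lends itself to a simple numerical experiment. The sketch below approximates the spherical average by a midpoint rule in the two angles; the test functions, the center, the radius, and the grid size are arbitrary choices made for illustration, and the quadrature is accurate only to a few decimal places.

```python
from math import sin, cos, pi, sqrt


def sphere_average(u, center, a, n=200):
    """Midpoint-rule approximation of the average of u over the sphere of radius a."""
    x0, y0, z0 = center
    dphi = pi / n            # n cells in the zenith angle, 0 <= phi <= pi
    dtheta = pi / n          # 2n cells in the azimuthal angle, -pi <= theta <= pi
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * dphi
        s, c = sin(phi), cos(phi)
        for j in range(2 * n):
            theta = -pi + (j + 0.5) * dtheta
            value = u(x0 + a * s * cos(theta), y0 + a * s * sin(theta), z0 + a * c)
            total += value * s           # sin(phi) is the surface area weight
    return total * dphi * dtheta / (4.0 * pi)


def u_harmonic(x, y, z):
    """A harmonic polynomial plus a Newtonian potential with its pole at (0, 0, 3)."""
    return x * x - y * y + 3.0 * z + 1.0 / sqrt(x * x + y * y + (z - 3.0) ** 2)


def u_not_harmonic(x, y, z):
    """x^2 has Laplacian 2, so the averaging property should fail."""
    return x * x


center = (0.2, -0.1, 0.4)
print(sphere_average(u_harmonic, center, 0.5), u_harmonic(*center))
print(sphere_average(u_not_harmonic, center, 0.5), u_not_harmonic(*center))
```

For the harmonic function the two printed numbers agree to quadrature accuracy, whereas for $x^2$ the average exceeds the central value by $a^2/3$.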

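Theorem 12.4. If $u(\mathbf{x})$ is a harmonic function defined on a domain $\Omega \subset \mathbb{R}^3$, then $u$
is analytic inside $\Omega$. Moreover, its value at any $\mathbf{x}_0 \in \Omega$ is obtained by averaging its values
on any sphere $S_a = \{\,\|\mathbf{x} - \mathbf{x}_0\| = a\,\}$ centered at $\mathbf{x}_0$:

$$u(\mathbf{x}_0) = \frac{1}{4\pi a^2} \iint_{S_a} u\;dS, \tag{12.55}$$

provided the enclosed ball lies within its domain of analyticity: $B_a = \{\,\|\mathbf{x} - \mathbf{x}_0\| \le a\,\} \subset \Omega$.

Proof: Let us denote the average of $u$ over the sphere of radius $a$ by

$$g(a) = \frac{1}{4\pi a^2} \iint_{S_a} u\;dS = \frac{1}{4\pi} \int_{-\pi}^{\pi}\!\int_0^{\pi} u(x_0 + a\sin\varphi\cos\theta,\; y_0 + a\sin\varphi\sin\theta,\; z_0 + a\cos\varphi)\,\sin\varphi\;d\varphi\,d\theta.$$

By continuity, as the radius $a \to 0$, the average of $u$ on the sphere $S_a$ tends to its value at
the center: $g(a) \to u(\mathbf{x}_0)$.

On the other hand, since $u \in C^2$ is harmonic in $B_a \subset \Omega$, the derivative

$$\begin{aligned}
g'(a) &= \frac{1}{4\pi} \int_{-\pi}^{\pi}\!\int_0^{\pi} \left(\sin\varphi\cos\theta\,\frac{\partial u}{\partial x} + \sin\varphi\sin\theta\,\frac{\partial u}{\partial y} + \cos\varphi\,\frac{\partial u}{\partial z}\right) \sin\varphi\;d\varphi\,d\theta \\
&= \frac{1}{4\pi a^2} \iint_{S_a} \frac{\partial u}{\partial \mathbf{n}}\;dS = \frac{1}{4\pi a^2} \iiint_{B_a} \Delta u\;dx\,dy\,dz = 0,
\end{aligned}$$

where $\mathbf{n}$ denotes the unit outwards normal to $S_a = \partial B_a$, and we used the divergence
identity in Exercise 12.1.11(a). We conclude that $g(a)$ is constant, and hence $g(a) = u(\mathbf{x}_0)$
for any $a > 0$ provided $B_a \subset \Omega$. Q.E.D.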


Arguing  as  in  the  planar  case  of  Theorem  4.9,  we  readily  establish  the  corresponding 
Strong  Maximum  Principle  for  harmonic  functions  of  three  variables. 


Theorem 12.5. A nonconstant harmonic function cannot have a local maximum or
minimum at any interior point of its domain of definition. Moreover, its global maximum
or minimum (if any) is located on the boundary of the domain.

For instance, the Maximum Principle implies that the maximum and minimum
temperatures in a solid body in thermal equilibrium are to be found only on its boundary. In
physical terms, since heat energy must flow away from an internal maximum and towards
an internal minimum, any local temperature extremum inside the body would preclude it
from being in thermal equilibrium. The Maximum Principle immediately implies a
Uniqueness Theorem for both the Laplace and Poisson equations, cf. Theorem 4.10, which in turn
establishes the solution formula (12.51) and hence analyticity of every harmonic function.

Example  12.6.  In  this  example,  we  shall  determine  the  electrostatic  potential  inside 
a  hollow  sphere  when  the  upper  and  lower  hemispheres  are  held  at  different  constant 
potentials.  This  device  is  called  a  spherical  capacitor  and  is  realized  experimentally  by 
separating  the  two  charged  conducting  hemispherical  shells  by  a  thin  insulating  ring  at 
the  equator.  A  straightforward  scaling  argument  allows  us  to  choose  our  units  so  that  the 
sphere  has  unit  radius,  while  the  potential  is  set  equal  to  1  on  the  upper  hemisphere  and 
equal  to  0,  i.e.,  grounded,  on  the  lower  hemisphere.  The  resulting  electrostatic  potential 
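satisfies the Laplace equation

$$\Delta u = 0 \qquad\text{inside a solid ball}\qquad \|\mathbf{x}\| < 1, \tag{12.56}$$

and is subject to Dirichlet boundary conditions

$$u(x, y, z) = h(x, y, z) = \begin{cases} 1, & z > 0, \\ 0, & z < 0, \end{cases} \qquad\text{on the unit sphere}\qquad \|\mathbf{x}\| = 1. \tag{12.57}$$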


The solution will be prescribed by a harmonic polynomial series (12.51) whose
coefficients are fixed by the boundary values (12.57). Before tackling the required computation,
let us first note that since the boundary data does not depend upon the azimuthal angle
$\theta$, the solution $u = u(r, \varphi)$ will also be independent of $\theta$. Therefore, we need only consider
the $\theta$-independent spherical harmonic polynomials (12.38), which are those with $m = 0$.
Thus,

$$u(x, y, z) = \frac12 \sum_{n=0}^{\infty} c_n\,H_n^0(x, y, z) = \frac12 \sum_{n=0}^{\infty} c_n\,r^n\,P_n(\cos\varphi), \tag{12.58}$$

where we abbreviate $c_n = c_{0,n}$. The boundary conditions (12.57) require

$$u\,\Big|_{r=1} = \frac12 \sum_{n=0}^{\infty} c_n\,P_n(\cos\varphi) = h(\varphi) = \begin{cases} 1, & 0 \le \varphi < \tfrac12\pi, \\ 0, & \tfrac12\pi < \varphi \le \pi. \end{cases}$$

The coefficients are given by (12.44), which, in the case $m = 0$, reduce to

$$c_n = \frac{2n + 1}{2\pi} \iint_{S_1} h\,Y_n^0\;dS = (2n + 1) \int_0^{\pi/2} P_n(\cos\varphi)\,\sin\varphi\;d\varphi = (2n + 1) \int_0^1 P_n(t)\;dt, \tag{12.59}$$

since $h = 0$ when $\tfrac12\pi < \varphi \le \pi$. The first few are $c_0 = 1$, $c_1 = \tfrac32$, $c_2 = 0$, $c_3 = -\tfrac78$, $c_4 = 0$.
Therefore, the solution has the explicit Taylor expansion

$$\begin{aligned}
u(x, y, z) &= \tfrac12 + \tfrac34\,r\cos\varphi - \tfrac{21}{128}\,r^3\cos\varphi - \tfrac{35}{128}\,r^3\cos 3\varphi + \cdots \\
&= \tfrac12 + \tfrac34\,z + \tfrac{21}{32}\,(x^2 + y^2)\,z - \tfrac{7}{16}\,z^3 + \cdots.
\end{aligned} \tag{12.60}$$


Note in particular that the value $u(0, 0, 0) = \tfrac12$ at the center of the sphere is the average
of its boundary values, in accordance with Theorem 12.4. The solution depends only on
the cylindrical coordinates $r, z$, which is a consequence of the invariance of the Laplace
equation under general rotations, coupled with the invariance of the boundary data under
rotations around the $z$-axis.

Remark :  The  same  solution  u(x,y,z)  describes  the  thermal  equilibrium  in  a  solid 
sphere  whose  upper  hemisphere  is  held  at  temperature  1°  and  lower  hemisphere  at  0°. 
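The coefficients in (12.59) are integrals of polynomials and so can be computed exactly. The following sketch does this and then sums a truncated version of the series (12.58); the function names and the truncation at 12 terms are our own choices, made only for illustration.

```python
from fractions import Fraction
from math import factorial, cos, pi, isclose

ONE_MINUS_T2 = [Fraction(1), Fraction(0), Fraction(-1)]  # coefficients of 1 - t^2


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_diff(a):
    """Differentiate a coefficient list; the derivative of a constant is [0]."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def legendre(n):
    """Coefficients of P_n(t) computed from formula (12.29)."""
    q = [Fraction(1)]
    for _ in range(n):
        q = poly_mul(q, ONE_MINUS_T2)
    for _ in range(n):
        q = poly_diff(q)
    return [Fraction((-1) ** n, 2 ** n * factorial(n)) * c for c in q]


def capacitor_coefficient(n):
    """c_n = (2n + 1) * integral of P_n(t) over [0, 1], as in (12.59)."""
    integral = sum((c / (k + 1) for k, c in enumerate(legendre(n))), Fraction(0))
    return (2 * n + 1) * integral


def interior_potential(r, phi, terms=12):
    """Partial sum of the series (12.58) at radius r <= 1 and zenith angle phi."""
    t = cos(phi)
    total = 0.0
    for n in range(terms):
        p_n = sum(float(c) * t ** k for k, c in enumerate(legendre(n)))
        total += float(capacitor_coefficient(n)) * r ** n * p_n
    return 0.5 * total


print([str(capacitor_coefficient(n)) for n in range(6)])
print(interior_potential(0.0, 0.3))                      # value at the center
print(interior_potential(0.5, 0.3) + interior_potential(0.5, pi - 0.3))
```

The last line illustrates a symmetry of the solution: since all even coefficients beyond $c_0$ vanish, the potentials at two points that are mirror images in the equatorial plane sum to 1.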

Example  12.7.  A  closely  related  problem  is  to  determine  the  electrostatic  potential 
outside  a  spherical  capacitor.  As  in  the  preceding  example,  we  take  our  capacitor  of  radius 
1,  with  electrostatic  charge  of  1  on  the  upper  hemisphere  and  0  on  the  lower  hemisphere. 
Here, we need to solve the Laplace equation $\Delta u = 0$ in the unbounded domain $\Omega =
\{\,\|\mathbf{x}\| > 1\,\}$ (the exterior of the unit sphere) subject to the same Dirichlet boundary
conditions (12.57). We anticipate that the potential will be vanishingly small at large
distances away from the capacitor: $r = \|\mathbf{x}\| \gg 1$. Therefore, the harmonic polynomial
solutions (12.49) will not help us solve this problem, since (except for the constant case)
they become unboundedly large far away from the origin.

However, revisiting our original separation of variables argument will produce a
different class of solutions having the desired decay properties. When we solved the radial
equation (12.47), we discarded the solution $v_2(r) = r^{-n-1}$ because it had a singularity at
the origin. In the present situation, the behavior of the function at $r = 0$ is irrelevant; our
requirement is that the solution decay as $r \to \infty$, and $v_2(r)$ has this property. Therefore,
we will utilize the complementary harmonic functions

$$\begin{aligned}
K_n^m(x, y, z) &= r^{-2n-1}\,H_n^m(x, y, z) = r^{-n-1}\,Y_n^m(\varphi, \theta) = r^{-n-1}\,p_n^m(\varphi)\,\cos m\theta, \\
\widetilde K_n^m(x, y, z) &= r^{-2n-1}\,\widetilde H_n^m(x, y, z) = r^{-n-1}\,\widetilde Y_n^m(\varphi, \theta) = r^{-n-1}\,p_n^m(\varphi)\,\sin m\theta,
\end{aligned} \tag{12.61}$$

for solving such exterior problems. For the capacitor problem, we need only those that are
independent of $\theta$, whereby $m = 0$. We write the resulting solution as a series

$$u(x, y, z) = \frac12 \sum_{n=0}^{\infty} c_n\,K_n^0(x, y, z) = \frac12 \sum_{n=0}^{\infty} c_n\,r^{-n-1}\,P_n(\cos\varphi). \tag{12.62}$$

The boundary conditions

$$u\,\Big|_{r=1} = \frac12 \sum_{n=0}^{\infty} c_n\,P_n(\cos\varphi) = h(\varphi) = \begin{cases} 1, & 0 \le \varphi < \tfrac12\pi, \\ 0, & \tfrac12\pi < \varphi \le \pi, \end{cases}$$

are identical to those in the previous example. Therefore, the coefficients are given by
(12.59), leading to the series expansion

$$\begin{aligned}
u(x, y, z) &= \frac{1}{2r} + \frac{3\cos\varphi}{4r^2} - \frac{21\cos\varphi + 35\cos 3\varphi}{128\,r^4} + \cdots \\
&= \frac{1}{2\sqrt{x^2 + y^2 + z^2}} + \frac{3z}{4\,(x^2 + y^2 + z^2)^{3/2}} + \frac{21\,(x^2 + y^2)\,z - 14 z^3}{32\,(x^2 + y^2 + z^2)^{7/2}} + \cdots.
\end{aligned} \tag{12.63}$$


Observe that the higher-order terms become negligible at large distances, and hence the
potential is asymptotic to that associated with a point charge concentrated at the origin
of magnitude $\tfrac12$, which is the average of the boundary potential over the sphere. This is
indicative of a general fact, to be explored in Exercise 12.2.32.
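The decay of the exterior solution can be illustrated numerically with a truncated version of the series (12.62). As before, this is only a sketch: the helper names and the truncation at 12 terms are assumptions made for the example, and the coefficients are those of (12.59).

```python
from fractions import Fraction
from math import factorial, cos, isclose

ONE_MINUS_T2 = [Fraction(1), Fraction(0), Fraction(-1)]  # coefficients of 1 - t^2


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_diff(a):
    """Differentiate a coefficient list; the derivative of a constant is [0]."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]


def legendre(n):
    """Coefficients of P_n(t) computed from formula (12.29)."""
    q = [Fraction(1)]
    for _ in range(n):
        q = poly_mul(q, ONE_MINUS_T2)
    for _ in range(n):
        q = poly_diff(q)
    return [Fraction((-1) ** n, 2 ** n * factorial(n)) * c for c in q]


def capacitor_coefficient(n):
    """c_n = (2n + 1) * integral of P_n(t) over [0, 1], as in (12.59)."""
    integral = sum((c / (k + 1) for k, c in enumerate(legendre(n))), Fraction(0))
    return (2 * n + 1) * integral


def series(r, phi, power, terms=12):
    """Half the sum of c_n * r**power(n) * P_n(cos phi) over the first few n."""
    t = cos(phi)
    total = 0.0
    for n in range(terms):
        p_n = sum(float(c) * t ** k for k, c in enumerate(legendre(n)))
        total += float(capacitor_coefficient(n)) * r ** power(n) * p_n
    return 0.5 * total


def interior_potential(r, phi, terms=12):
    """Partial sum of (12.58), built from r^n."""
    return series(r, phi, lambda n: n, terms)


def exterior_potential(r, phi, terms=12):
    """Partial sum of (12.62), built from r^(-n-1)."""
    return series(r, phi, lambda n: -n - 1, terms)


r, phi = 2.0, 0.4
print(exterior_potential(r, phi), interior_potential(1.0 / r, phi) / r)
print(1.0e4 * exterior_potential(1.0e4, phi))   # approaches 1/2 far from the capacitor
```

The first printed pair agrees because each exterior term $r^{-n-1} P_n$ equals $r^{-1}$ times the interior term evaluated at radius $1/r$, and the second line reflects the leading $1/(2r)$ behavior.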


Exercises 

12.2.1. A solid ball of radius $R$ has its upper hemispherical surface held at temperature $T_1$ and
its lower hemispherical surface held at temperature $T_0$. Find the resulting equilibrium
temperature.

12.2.2.  A  solid  ball  has  its  top  hemispherical  surface  insulated  and  its  bottom  hemispherical 
surface  held  at  a  fixed  temperature  of  10°.  Find  its  equilibrium  temperature. 

12.2.3.  Find  the  potential  inside  a  spherical  capacitor  of  radius  R  when  the  upper  hemisphere 
is  at  potential  a  and  the  lower  is  at  f3 . 

12.2.4.  Find  the  potential  u(x ,  y,  z)  inside  a  unit  spherical  capacitor  that  has  the  indicated 

boundary  values  on  the  unit  sphere  x2  +  y2  +  z2  —  1:  (a)  x,  (b)  x2  +  y2,  (c)  x3 . 

Hint :  The  potential  is  a  polynomial. 

12.2.5. Each point on the spherical boundary of a solid ball of radius 1 has temperature equal
to its zenith angle $\varphi$. (i) Find the value of the equilibrium temperature at the center of the
ball. (ii) Find the Taylor polynomial of degree 3, based at the origin, for the equilibrium
temperature distribution.

12.2.6. Solve Exercise 12.2.5 when the boundary temperature equals (a) $\cos\varphi$, (b) $\cos\theta$, (c) $\theta$.

12.2.7.  A  solid  spherical  container  of  radius  3  cm  contains  a  hollow  spherical  cavity  of  radius 
1  cm  in  its  center.  The  inner  cavity  is  filled  with  boiling  water  at  100°,  while  the  entire 
container  is  immersed  in  an  ice  water  bath  at  0°.  Assume  that  the  container  is  in  thermal 
equilibrium.  True  or  false:  The  temperature  at  a  point  half-way  between  the  container’s 
inner  and  outer  boundaries  is  50°.  If  true,  explain.  If  false,  what  is  the  temperature  at  such 
a  point? 

12.2.8.  Find  the  electrostatic  potential  between  two  concentric  spherical  metal  shells  of 
respective  radii  1  and  1.2,  given  that  the  inner  shell  is  grounded,  while  the  outer  shell  has 
potential  equal  to  1. 
0  12.2.9.  Use  the  chain  rule  to  establish  the  formula  (12.16)  for  the  Laplacian  in  spherical
coordinates. 

^ 12.2.10. (a) Prove that $t = \pm 1$ are both regular singular points for the order 0 Legendre
differential equation (12.28). (b) Prove that the Legendre eigenvalue problem (12.27-28) is
defined by a self-adjoint operator with respect to the $L^2$ inner product on the cut locus
$[-1, 1]$. (c) Discuss the orthogonality of the Legendre polynomials.

0  12.2.11.  Solve  Exercise  12.2.10  for  the  Legendre  eigenvalue  problem  (12.26-27)  of  order  m 
along  with  the  relevant  Ferrers  eigenfunctions. 

0  12.2.12.  Suppose  m  >  0.  (a)  Find  the  Green’s  function  for  the  boundary  value  problem 

Y2  ~p  i  ~p  2 

<i "  *  >  dF  "  2 ^  ~  P  =  m’  '  P(-1}  I’  1  F(1)  1  <  “■ 


rn  rn 

2 

and 

(b) Use part (a) to prove completeness of the Ferrers functions of order $m > 0$ on $[-1, 1]$.

(c)  Explain  why  there  is  no  Green’s  function  in  the  order  m  =  0  case. 

Remark: When $m = 0$, one can use the trick of Example 9.49 to prove completeness.
Although the Green's function for the modified operator does not have an explicit elementary
formula, one can prove that it has logarithmic singularities at the endpoints, and hence
finite double $L^2$ norm. See [120; §43] for details.


Hint: The homogeneous differential equation has solutions $\left(\dfrac{1+t}{1-t}\right)^{m/2}$ and $\left(\dfrac{1-t}{1+t}\right)^{m/2}$.

12.2.13.  What  happens  when  n  <  m  in  formula  (12.31)? 

^ 12.2.14. Prove that the Legendre polynomial (12.29) has the explicit formula

$$
P_n(t) = \sum_{0 \le 2m \le n} (-1)^m\, \frac{(2n - 2m)!}{2^n\,(n-m)!\;m!\;(n-2m)!}\; t^{\,n-2m}. \tag{12.64}
$$


<0 12.2.15. Prove the following recurrence relation for the Ferrers functions:

$$
P_n^{m+1}(t) = \sqrt{1-t^2}\;\frac{dP_n^m}{dt} + \frac{m\,t}{\sqrt{1-t^2}}\;P_n^m(t). \tag{12.65}
$$


G 12.2.16. In this exercise, we determine the $L^2$ norms of the Ferrers functions. (a) First, prove
that $\displaystyle\int_{-1}^{1} (1-t^2)^n\,dt = \frac{2^{2n+1}\,(n!)^2}{(2n+1)!}$. Hint: Set $t = \cos\theta$ and then integrate by parts
repeatedly. (b) Prove that $\|P_n\|^2 = \dfrac{2}{2n+1}$. Hint: Integrate by parts repeatedly and then
use part (a). (c) Prove that $\|P_n^{m+1}\|^2 = (n-m)(n+m+1)\,\|P_n^m\|^2$. Hint: Use (12.65)
and an integration by parts. (d) Finally, prove that $\|P_n^m\|^2 = \dfrac{2}{2n+1}\,\dfrac{(n+m)!}{(n-m)!}$.


12.2.17. (a) Prove that $P_n^m(t)$ is an even or odd function according to whether $m + n$ is an even
or odd integer. (b) Prove that its Fourier form depends only on $\cos n\varphi$, $\cos(n-2)\varphi$,
$\cos(n-4)\varphi$, ... if $m$ is even, and only on $\sin n\varphi$, $\sin(n-2)\varphi$, $\sin(n-4)\varphi$, ... if $m$ is odd.


12.2.18. Let $m$ be fixed. Are the functions for $n = 0, 1, 2, \ldots$ mutually orthogonal with
respect to the standard $L^2$ inner product on $[0, \pi]$? If not, is there an inner product that
makes them orthogonal functions?


12.2.19. Prove that the surfaces defined by the first three spherical harmonics $Y_0^0$, $Y_1^0$, and $Y_1^1$,
as in Figure 12.5, are all spheres. Find their centers and radii.

0  12.2.20.  Explain  why  the  surface  defined  by  r  =  is  obtained  by  rotating  that  defined 

by  r  =  Vnm((^,  0)  around  the  z— axis  by  90°. 
<J) 12.2.21. Prove directly that the spherical Laplacian $\Delta_S$ is a self-adjoint linear operator with
respect to the inner product (12.40).

0  12.2.22.  (a)  In  view  of  Exercise  12.2.21,  which  orthogonality  relations  in  (12.41)  follow  from 
their  status  as  eigenfunctions  of  the  spherical  Laplacian? 

(b)  Prove  the  general  orthogonality  formulae  by  direct  computation. 


^ 12.2.23. State and prove the orthogonality of the complex spherical harmonics (12.46). Then
establish the following formula for their norms:

$$
\|\mathcal{Y}_n^m\|^2 = \frac{4\pi\,(n+m)!}{(2n+1)\,(n-m)!}, \qquad n = 0, 1, 2, \ldots, \quad m = -n, -n+1, \ldots, n. \tag{12.66}
$$


12.2.24.  Prove  the  formulae  (12.42)  for  the  norms  of  the  spherical  harmonics. 

Hint :  Use  Exercise  12.2.16. 

0  12.2.25.  Justify  the  formulas  in  (12.50)  for  (a)  H®,  (b)  H®,  (c)  H\. 

12.2.26.  Find  formulas  for  the  following  harmonic  polynomials  (x)  in  spherical  coordinates; 

[%%)  in  rectangular  coordinates:  (a)  (b)  Hf,  (c)  H\. 

12.2.27.  Explain  why  every  polynomial  solution  of  the  Laplace  equation  is  a  linear  combination 
of  the  harmonic  polynomials  (12.49).  Hint :  Look  at  its  Taylor  series. 

12.2.28. (a) Prove that if $u(x, y, z)$ is any harmonic polynomial, then so are $u(y, x, z)$, $u(z, x, y)$,
and all other functions obtained by permuting the variables $x, y, z$. (b) Discuss the effect of
such permutations on the basis harmonic polynomials $H_n^m(x, y, z)$ appearing in (12.50).

12.2.29.  Find  the  formulas  in  rectangular  coordinates  for  the  following  complementary  har¬ 
monic  functions:  (a)  Kq,  (b)  K.\ ,  (c)  K 2,  (d)  K\. 

§ 12.2.30. Let $u(x, y, z)$ be a harmonic function defined on the unit ball $r \le 1$. Prove that its
gradient at the center, $\nabla u(\mathbf{0})$, equals the average of the vector field $\mathbf{v}(\mathbf{x}) = \mathbf{x}\,u(\mathbf{x})$ over the
unit sphere $r = 1$.


0 12.2.31. (a) Suppose $u(x, y, z)$ is a solution to the Laplace equation. Prove that the function
$U(x, y, z) = r^{-1}\,u(x/r^2,\, y/r^2,\, z/r^2)$ obtained by inversion is also a solution. (b) Explain
how inversion can be used to solve boundary value problems on the exterior of a sphere.
(c) Use inversion to relate the solutions to Examples 12.6 and 12.7.


0 12.2.32. Suppose $u(r, \varphi, \theta)$ is the potential exterior to a spherical capacitor of unit radius.

(a) Prove that $\displaystyle\lim_{r \to \infty} r\,u(r, \varphi, \theta)$ equals the average value of $u$ on the sphere.

(b) Use Exercise 12.2.31 to deduce this result as a consequence of Theorem 12.4.


12.2.33. (a) Write out, using spherical coordinates, formulas for the $L^2$ inner product and norm
for scalar fields $f(r, \varphi, \theta)$ and $g(r, \varphi, \theta)$ on a solid ball of unit radius centered at the origin.

(b) Let $f(x, y, z) = z$ and $g(x, y, z) = x^2 + y^2$. Find $\|f\|$, $\|g\|$ and $\langle f, g \rangle$.

(c) Verify the Cauchy-Schwarz and triangle inequalities for these two functions.


^ 12.2.34. Use separation of variables to construct a Fourier series solution to the Laplace
equation on a rectangular box, $B = \{0 < x < a,\ 0 < y < b,\ 0 < z < c\}$, subject to the
Dirichlet boundary conditions

$$
u(x, y, z) = \begin{cases} h(x, y), & z = 0,\ \ 0 < x < a,\ \ 0 < y < b, \\ 0, & \text{at all other points in } \partial B. \end{cases}
$$


12.2.35. Find the equilibrium temperature distribution inside a unit cube that has 100° temperature
on its top face, 0° on its bottom face, while all four side faces are insulated.


12.2.36. Solve Exercise 12.2.35 when the top face of the cube has temperature

$u(x, y, 1) = \cos \pi x \,\cos \pi y$.

£  12.2.37.  A  solid  unit  cube  is  in  thermal  equilibrium  when  subject  to  100°  temperature  on  its 
top  face  and  0°  on  all  other  faces.  True  or  false:  The  temperature  at  the  center  equals  the 
average  temperature  over  the  surface  of  the  cube. 
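12.2.38. Solve the boundary value problem

$$
\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} + u = \cos x \cos y, \qquad 0 < x, y, z < \pi,
$$

$$
u(x, y, 0) = 1, \qquad \frac{\partial u}{\partial z}(x, y, \pi) = \frac{\partial u}{\partial y}(x, 0, z) = \frac{\partial u}{\partial y}(x, \pi, z) = \frac{\partial u}{\partial x}(0, y, z) = \frac{\partial u}{\partial x}(\pi, y, z) = 0.
$$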


12.2.39.  Let  C  be  the  cylinder  of  height  1  and  diameter  1  that  sits  on  the  (x,  y)— plane  centered 

on  the  z-axis.  (a)  Write  out,  in  cylindrical  coordinates,  the  explicit  formula  for  the  L2 
inner  product  and  norm  on  C. 

(b) Let $f(x, y, z) = z$ and $g(x, y, z) = x^2 + y^2$. Find $\|f\|$, $\|g\|$ and $\langle f, g \rangle$.

(c)  Verify  the  Cauchy-Schwarz  and  triangle  inequalities  for  these  two  functions. 

^  12.2.40.  (a)  Write  out  the  Laplace  equation  in  cylindrical  coordinates. 

(b) Use separation of variables to construct a series solution to the Laplace equation on the
cylinder $C = \{x^2 + y^2 < 1,\ 0 < z < 1\}$, subject to the Dirichlet boundary conditions

$$
u(x, y, z) = \begin{cases} h(x, y), & z = 0,\ \ x^2 + y^2 < 1, \\ 0, & \text{at all other points in } \partial C. \end{cases}
$$


12.2.41. A cylinder of radius 1 and height 2 has 100° temperature on its top face, 0° on its
bottom face, while its curved side is fully insulated. Find its equilibrium temperature
distribution.


12.2.42.  Solve  Exercise  12.2.41  if  the  curved  sides  are  kept  at  0°  instead. 


12.3  Green’s  Functions  for  the  Poisson  Equation 

We  now  turn  to  the  inhomogeneous  form  of  the  three-dimensional  Laplace  equation: 
the  Poisson  equation 

$$-\Delta u = f, \tag{12.67}$$

on a solid domain $\Omega \subset \mathbb{R}^3$. In order to uniquely specify the solution, we must impose
appropriate  boundary  conditions:  Dirichlet  or  mixed.  (As  in  the  planar  version,  Neumann 
boundary  value  problems  have  either  infinitely  many  solutions  or  no  solutions,  depending 
upon  whether  the  Fredholm  conditions  are  satisfied  or  not.)  We  only  need  to  discuss  the 
case  of  homogeneous  boundary  conditions,  since,  by  linear  superposition,  an  inhomogeneous 
boundary  value  problem  can  be  split  into  a  homogeneous  boundary  value  problem  for  the 
inhomogeneous  Poisson  equation  along  with  an  inhomogeneous  boundary  value  problem 
for  the  homogeneous  Laplace  equation. 

As  in  Chapter  6,  we  begin  by  analyzing  the  case  of  a  delta  function  inhomogeneity 
that is concentrated at a single point in the domain. Thus, for each $\boldsymbol{\xi} = (\xi, \eta, \zeta) \in \Omega$, the
Green's function $G(\mathbf{x}; \boldsymbol{\xi}) = G(x, y, z; \xi, \eta, \zeta)$ is the unique solution to the Poisson equation

$$-\Delta u = \delta(\mathbf{x} - \boldsymbol{\xi}) = \delta(x - \xi)\,\delta(y - \eta)\,\delta(z - \zeta) \quad \text{for all} \quad \mathbf{x} \in \Omega, \tag{12.68}$$

subject to the chosen homogeneous boundary conditions. The solution to the general
Poisson equation (12.67) is then obtained by superposition: We write the forcing function

$$f(x, y, z) = \iiint_{\Omega} f(\xi, \eta, \zeta)\,\delta(x - \xi)\,\delta(y - \eta)\,\delta(z - \zeta)\;d\xi\,d\eta\,d\zeta \tag{12.69}$$

as a linear superposition of delta functions. By linearity, the solution

$$u(x, y, z) = \iiint_{\Omega} f(\xi, \eta, \zeta)\,G(x, y, z; \xi, \eta, \zeta)\;d\xi\,d\eta\,d\zeta \tag{12.70}$$

to the homogeneous boundary value problem for the Poisson equation (12.67) is then given
as the corresponding superposition of the Green's function solutions.

The  Green’s  function  can  also  be  used  to  solve  the  inhomogeneous  Dirichlet  boundary 
value  problem 

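$$-\Delta u = 0, \quad \mathbf{x} \in \Omega, \qquad u = h, \quad \mathbf{x} \in \partial\Omega. \tag{12.71}$$

The same argument that was used in the two-dimensional situation produces the solution

$$u(\mathbf{x}) = -\iint_{\partial\Omega} \frac{\partial G}{\partial \mathbf{n}}(\mathbf{x}; \boldsymbol{\xi})\; h(\boldsymbol{\xi})\; dS, \tag{12.72}$$

where the normal derivative is taken with respect to the variable $\boldsymbol{\xi} \in \partial\Omega$. In the case that
$\Omega$ is a solid ball, this integral formula effectively sums the spherical harmonic series (12.51);
see Theorem 12.12 below.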


The  Free-Space  Green’s  Function 


Only  in  a  few  specific  instances  is  an  explicit  formula  for  the  Green’s  function  known. 
Nevertheless,  certain  general  guiding  features  can  be  readily  established.  The  starting 
point is to investigate the Poisson equation (12.68) when the domain $\Omega = \mathbb{R}^3$ is all of
three-dimensional space. We impose boundary constraints by seeking a solution that goes
to zero, $u(\mathbf{x}) \to 0$, at large distances, $\|\mathbf{x}\| \to \infty$. Since the Laplacian operator is invariant
under translations, we can, without loss of generality, place our delta impulse at the origin,
and concentrate on solving the particular case

$$-\Delta u = \delta(\mathbf{x}), \qquad \mathbf{x} \in \mathbb{R}^3.$$

Since $\delta(\mathbf{x}) = 0$ for all $\mathbf{x} \ne \mathbf{0}$, the desired solution will, in fact, be a solution to the
homogeneous Laplace equation

$$\Delta u = 0, \qquad \mathbf{x} \ne \mathbf{0},$$

save, possibly, for a singularity at the origin.

The  Laplace  equation  models  the  equilibria  of  a  uniform  isotropic  medium,  and  so,  as 
noted  in  Exercise  12.1.7,  is  also  invariant  under  three-dimensional  rotations.  This  suggests 
that,  in  any  radially  symmetric  configuration,  the  solution  should  depend  only  on  the 
distance $r = \|\mathbf{x}\|$ from the origin. Referring to the spherical coordinate form (12.16) of
the Laplacian operator, if $u$ is a function of $r$ only, then its derivatives with respect to the
angular coordinates $\varphi, \theta$ are zero, and so $u(r)$ solves the ordinary differential equation

$$\frac{d^2 u}{dr^2} + \frac{2}{r}\,\frac{du}{dr} = 0. \tag{12.73}$$

This equation is, in effect, a first-order linear ordinary differential equation for $v = du/dr$
and hence is particularly easy to solve:

$$\frac{du}{dr} = v(r) = -\frac{b}{r^2}, \qquad \text{and hence} \qquad u(r) = a + \frac{b}{r},$$

where $a, b$ are arbitrary constants. The constant solution $u(r) = a$ does not die away at
large distances, nor does it have a singularity at the origin. Therefore, if our intuition is
valid, the desired solution should be of the form

$$u = \frac{b}{r} = \frac{b}{\|\mathbf{x}\|}. \tag{12.74}$$
Indeed, this function is harmonic (it solves Laplace's equation) everywhere away from
the origin and has a singularity at $\mathbf{x} = \mathbf{0}$.

The  solution  (12.74)  is,  up  to  a  constant  multiple,  the  three-dimensional  Newtonian 
gravitational  potential  due  to  a  point  mass  at  the  origin.  Its  gradient, 


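$$\mathbf{f}(\mathbf{x}) = \nabla u(\mathbf{x}) = \nabla\!\left(\frac{b}{r}\right) = -\frac{b\,\mathbf{x}}{\|\mathbf{x}\|^3}, \tag{12.75}$$

defines the gravitational force vector at the point $\mathbf{x}$. When $b > 0$, the force $\mathbf{f}(\mathbf{x})$ points
toward the mass at the origin. Its magnitude

$$\|\mathbf{f}\| = \frac{|b|}{\|\mathbf{x}\|^2} = \frac{|b|}{r^2}$$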


is  proportional  to  the  reciprocal  of  the  squared  distance,  which  is  the  well-known  inverse 
square  law  of  three-dimensional  Newtonian  gravity.  Formula  (12.75)  can  also  be  interpreted 
as  the  electrostatic  force  due  to  a  concentrated  electric  charge  at  the  origin,  with  (12.74) 
giving  the  corresponding  Coulomb  potential.  The  constant  b  is  positive  when  the  charges 
are  of  opposite  signs,  leading  to  an  attractive  force,  and  negative  in  the  repulsive  case  of 
like  charges. 

Returning  to  our  problem,  the  remaining  task  is  to  fix  the  multiple  b  such  that  the 
Laplacian  of  our  candidate  solution  (12.74)  has  a  delta  function  singularity  at  the  origin; 
equivalently,  we  must  determine  a  =  1/b  such  that 

$$-\Delta(r^{-1}) = a\,\delta(\mathbf{x}). \tag{12.76}$$

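This equation is certainly valid away from the origin, since $\delta(\mathbf{x}) = 0$ when $\mathbf{x} \ne \mathbf{0}$. To
investigate near the singularity, we integrate both sides of (12.76) over a small solid ball
$B_\varepsilon = \{\, \|\mathbf{x}\| \le \varepsilon \,\}$ of radius $\varepsilon$:

$$-\iiint_{B_\varepsilon} \Delta(r^{-1})\;dx\,dy\,dz = \iiint_{B_\varepsilon} a\,\delta(\mathbf{x})\;dx\,dy\,dz = a, \tag{12.77}$$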


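where we used the definition of the delta function to evaluate the right-hand side. On the
other hand, since $\Delta r^{-1} = \nabla \cdot \nabla r^{-1}$, we can use the divergence theorem (12.8) to evaluate
the left-hand integral, whence

$$\iiint_{B_\varepsilon} \Delta(r^{-1})\;dx\,dy\,dz = \iiint_{B_\varepsilon} \nabla \cdot \nabla(r^{-1})\;dx\,dy\,dz = \iint_{S_\varepsilon} \frac{\partial}{\partial \mathbf{n}}\!\left(\frac{1}{r}\right) dS,$$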


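where the surface integral is over the bounding sphere $S_\varepsilon = \partial B_\varepsilon = \{\, \|\mathbf{x}\| = \varepsilon \,\}$. The
sphere's unit normal $\mathbf{n}$ points in the radial direction, and hence the normal derivative
coincides with differentiation with respect to $r$; in particular,

$$\frac{\partial}{\partial \mathbf{n}}\!\left(\frac{1}{r}\right) = \frac{\partial}{\partial r}\!\left(\frac{1}{r}\right) = -\frac{1}{r^2}.$$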


The  surface  integral  can  now  be  explicitly  evaluated: 


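$$\iint_{S_\varepsilon} \frac{\partial}{\partial \mathbf{n}}\!\left(\frac{1}{r}\right) dS = -\iint_{S_\varepsilon} \frac{1}{r^2}\;dS = -\iint_{S_\varepsilon} \frac{1}{\varepsilon^2}\;dS = -4\pi,$$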


since $S_\varepsilon$ has surface area $4\pi\varepsilon^2$. Substituting this result back into (12.77), we conclude that

$$a = 4\pi, \qquad \text{and hence} \qquad -\Delta\, r^{-1} = 4\pi\,\delta(\mathbf{x}). \tag{12.78}$$
This is our desired formula! We conclude that a solution to the Poisson equation with a
delta function impulse at the origin is

$$G(x, y, z) = \frac{1}{4\pi r} = \frac{1}{4\pi\,\|\mathbf{x}\|} = \frac{1}{4\pi\sqrt{x^2 + y^2 + z^2}}, \tag{12.79}$$


which  is  the  three-dimensional  Newtonian  potential  due  to  a  unit  point  mass  situated  at 
the  origin. 
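The constant $4\pi$ in (12.78) can be checked numerically. Since $\nabla(1/r) = -\mathbf{x}/\|\mathbf{x}\|^3$ is divergence-free away from the origin, its flux through any sphere that encloses the origin should equal $-4\pi$, and the flux through a sphere that does not enclose it should vanish. The sketch below is an illustration only: the function name, the midpoint quadrature rule, and the sample spheres are assumptions made for this example.

```python
import math

def flux_of_grad_inverse_r(center, radius, n=200):
    """Midpoint-rule estimate of the flux of grad(1/r) through the sphere
    with the given center and radius (outward normal)."""
    cx, cy, cz = center
    d_phi = math.pi / n
    d_theta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * d_phi
        sp, cp = math.sin(phi), math.cos(phi)
        for j in range(n):
            theta = (j + 0.5) * d_theta
            # outward unit normal of the sphere at this surface point
            nx, ny, nz = sp * math.cos(theta), sp * math.sin(theta), cp
            x, y, z = cx + radius * nx, cy + radius * ny, cz + radius * nz
            r3 = (x * x + y * y + z * z) ** 1.5
            # grad(1/r) = -x / r^3, so the normal derivative is -(x . n) / r^3
            normal_derivative = -(x * nx + y * ny + z * nz) / r3
            total += normal_derivative * radius**2 * sp * d_phi * d_theta
    return total

if __name__ == "__main__":
    print(flux_of_grad_inverse_r((0.0, 0.0, 0.0), 0.1), -4 * math.pi)
    print(flux_of_grad_inverse_r((0.3, 0.0, 0.2), 1.0))   # still encloses the origin
    print(flux_of_grad_inverse_r((3.0, 0.0, 0.0), 1.0))   # origin outside the sphere
```

The off-center sphere is included to emphasize that the value $-4\pi$ depends only on whether the singularity lies inside the surface, not on the radius or the position of the sphere.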

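If the singularity is concentrated at some other point $\boldsymbol{\xi} = (\xi, \eta, \zeta)$, then we merely
translate the preceding solution. This leads immediately to the free-space Green's function

$$G(\mathbf{x}; \boldsymbol{\xi}) = G(\mathbf{x} - \boldsymbol{\xi}) = \frac{1}{4\pi\,\|\mathbf{x} - \boldsymbol{\xi}\|} = \frac{1}{4\pi\sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}}. \tag{12.80}$$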


The  superposition  principle  (12.70)  implies  the  following  integral  formula  for  the  solutions 
to  the  Poisson  equation  on  all  of  three-dimensional  space. 


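Theorem 12.8. Assuming that $f(\mathbf{x}) \to 0$ sufficiently rapidly as $\|\mathbf{x}\| \to \infty$, a particular
solution to the Poisson equation

$$-\Delta u = f, \qquad \text{for} \quad \mathbf{x} \in \mathbb{R}^3, \tag{12.81}$$

is given by

$$u_\star(\mathbf{x}) = \frac{1}{4\pi} \iiint_{\mathbb{R}^3} \frac{f(\boldsymbol{\xi})}{\|\mathbf{x} - \boldsymbol{\xi}\|}\;d\boldsymbol{\xi} = \frac{1}{4\pi} \iiint_{\mathbb{R}^3} \frac{f(\xi, \eta, \zeta)\;d\xi\,d\eta\,d\zeta}{\sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}}. \tag{12.82}$$

The general solution is $u(x, y, z) = u_\star(x, y, z) + w(x, y, z)$, where $w(x, y, z)$ is an arbitrary
harmonic function.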

Example  12.9.  In  this  example,  we  compute  the  gravitational  (or  electrostatic) 
potential  in  three-dimensional  space  due  to  a  uniform  solid  ball,  e.g.,  a  spherical  planet 
such  as  the  Earth.  By  rescaling,  it  suffices  to  consider  the  case  in  which  the  forcing  function 
is  equal  to  1  inside  a  ball  of  radius  1  and  zero  outside: 


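$$f(\mathbf{x}) = \begin{cases} 1, & \|\mathbf{x}\| < 1, \\ 0, & \|\mathbf{x}\| > 1. \end{cases}$$

The particular solution to the resulting Poisson equation (12.81) is given by the integral

$$u(\mathbf{x}) = \frac{1}{4\pi} \iiint_{\|\boldsymbol{\xi}\| < 1} \frac{d\xi\,d\eta\,d\zeta}{\|\mathbf{x} - \boldsymbol{\xi}\|}. \tag{12.83}$$

Clearly, since the forcing function is radially symmetric, the solution $u = u(r)$ is also
radially symmetric. To evaluate the integral, then, we can take $\mathbf{x} = (0, 0, z)$ to lie on the
$z$-axis, so that $r = \|\mathbf{x}\| = |z|$. We use cylindrical coordinates $\boldsymbol{\xi} = (\rho\cos\theta,\ \rho\sin\theta,\ \zeta)$, so
that $\|\mathbf{x} - \boldsymbol{\xi}\| = \sqrt{\rho^2 + (z - \zeta)^2}$.
The integral in (12.83) can then be explicitly computed:

$$\frac{1}{4\pi} \int_{-1}^{1} \int_{0}^{\sqrt{1 - \zeta^2}} \int_{0}^{2\pi} \frac{\rho\;d\theta\,d\rho\,d\zeta}{\sqrt{\rho^2 + (z - \zeta)^2}} = \frac{1}{2} \int_{-1}^{1} \left( \sqrt{1 + z^2 - 2z\zeta} - |z - \zeta| \right) d\zeta = \begin{cases} \dfrac{1}{3\,|z|}, & |z| \ge 1, \\[2mm] \dfrac{1}{2} - \dfrac{z^2}{6}, & |z| \le 1. \end{cases}$$

Therefore, by radial symmetry, the solution is

$$u(r) = \begin{cases} \dfrac{1}{3r}, & r \ge 1, \\[2mm] \dfrac{1}{2} - \dfrac{r^2}{6}, & r \le 1, \end{cases} \tag{12.84}$$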


plotted, as a function of $r = \|\mathbf{x}\|$, in Figure 12.6. Note that, outside the solid ball, the
solution is a Newtonian potential corresponding to a concentrated point mass of magnitude
$\tfrac{4}{3}\pi$, the total mass of the planet. We have thus demonstrated a well-known result in
gravitation  and  electrostatics:  the  exterior  potential  due  to  a  spherically  symmetric  mass 
(or  electrically  charged  body)  is  the  same  as  if  all  the  mass  (charge)  were  concentrated  at 
its  center.  In  the  darkness  of  outer  space,  if  you  cannot  see  a  spherical  planet,  you  can 
determine  only  its  mass,  not  its  size,  by  measuring  its  external  gravitational  force. 
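The closed form (12.84) is easy to confirm numerically. The sketch below applies a midpoint rule to the one-dimensional integral $\tfrac12\int_{-1}^{1}\bigl(\sqrt{1+z^2-2z\zeta}-|z-\zeta|\bigr)\,d\zeta$ from Example 12.9 and compares the result with the piecewise formula. The function names and the choice of quadrature are assumptions made for illustration.

```python
import math

def potential_on_axis(z, n=20000):
    """Midpoint-rule value of (1/2) * integral over [-1, 1] of
    sqrt(1 + z^2 - 2 z zeta) - |z - zeta|, i.e. the potential at (0, 0, z)."""
    h = 2.0 / n
    total = 0.0
    for k in range(n):
        zeta = -1.0 + (k + 0.5) * h
        # max(...) guards against tiny negative values caused by rounding
        total += math.sqrt(max(0.0, 1 + z * z - 2 * z * zeta)) - abs(z - zeta)
    return 0.5 * total * h

def potential_exact(r):
    """Closed form (12.84) for the uniform unit ball."""
    return 1 / (3 * r) if r >= 1 else 0.5 - r * r / 6

if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(z, potential_on_axis(z), potential_exact(abs(z)))
```

Outside the ball the computed values track $1/(3r)$, which is the potential of a point mass $\tfrac43\pi$ under the normalization $1/(4\pi r)$ for a unit mass.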


Bounded  Domains  and  the  Method  of  Images 


Suppose  we  now  wish  to  solve  the  inhomogeneous  Poisson  equation  (12.67)  on  a  bounded 
domain $\Omega \subset \mathbb{R}^3$. To construct the desired Green's function, we proceed as follows. The
Newtonian potential (12.80) is a particular solution to the underlying inhomogeneous equation

$$-\Delta u = \delta(\mathbf{x} - \boldsymbol{\xi}), \qquad \mathbf{x} \in \Omega, \tag{12.85}$$

but it almost surely does not have the proper boundary values on $\partial\Omega$. By linearity, the
general solution to such an inhomogeneous linear equation must take the form

$$u(\mathbf{x}) = \frac{1}{4\pi\,\|\mathbf{x} - \boldsymbol{\xi}\|} - v(\mathbf{x}), \tag{12.86}$$

where the first term is a particular solution, while $v(\mathbf{x})$ is an arbitrary solution to the
homogeneous equation $\Delta v = 0$, i.e., an arbitrary harmonic function. The solution (12.86)
satisfies the homogeneous boundary conditions, provided the boundary values of $v(\mathbf{x})$ match
those of the free-space Green's function. Let us explicitly state the result in the Dirichlet case.


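Theorem 12.10. The Green's function for the homogeneous Dirichlet boundary
value problem

$$-\Delta u = f \quad \text{for} \quad \mathbf{x} \in \Omega, \qquad u = 0 \quad \text{for} \quad \mathbf{x} \in \partial\Omega,$$

for the Poisson equation in a domain $\Omega \subset \mathbb{R}^3$ has the form

$$G(\mathbf{x}; \boldsymbol{\xi}) = \frac{1}{4\pi\,\|\mathbf{x} - \boldsymbol{\xi}\|} - v(\mathbf{x}; \boldsymbol{\xi}), \tag{12.87}$$

where $v(\mathbf{x}; \boldsymbol{\xi})$ is the harmonic function of $\mathbf{x} \in \Omega$ that satisfies

$$v(\mathbf{x}; \boldsymbol{\xi}) = \frac{1}{4\pi\,\|\mathbf{x} - \boldsymbol{\xi}\|} \quad \text{for all} \quad \mathbf{x} \in \partial\Omega. \tag{12.88}$$

Figure 12.7. Method of Images for the unit ball.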

In  this  manner,  we  have  reduced  the  determination  of  the  Green’s  function  to  the 
solution to a particular family of Laplace boundary value problems, which are parametrized
by the point $\boldsymbol{\xi} \in \Omega$. In certain domains with simple geometry, the Method of Images can
be  used  to  produce  an  explicit  formula  for  the  Green’s  function.  As  in  Section  6.3,  the  idea 
is  to  match  the  boundary  values  of  the  free-space  Green’s  function  due  to  a  delta  impulse 
at  a  point  inside  the  domain  with  one  or  more  additional  Green’s  functions  corresponding 
to  impulses  at  points  outside  the  domain  —  the  “image  points” . 

The  case  of  a  solid  ball  of  radius  1  with  Dirichlet  boundary  conditions  is  the  easiest  to 
handle.  Indeed,  the  same  geometric  construction  that  we  used  for  a  planar  disk,  redrawn 
in  Figure  12.7,  applies  here.  Although  identical  to  Figure  6.13,  we  are  re-interpreting  it  as 
a  three-dimensional  diagram,  with  the  circle  representing  the  unit  sphere,  while  the  lines 
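remain lines. The required image point is given by inversion:

$$\boldsymbol{\eta} = \frac{\boldsymbol{\xi}}{\|\boldsymbol{\xi}\|^2}, \qquad \text{whereby} \qquad \|\boldsymbol{\eta}\| = \frac{1}{\|\boldsymbol{\xi}\|}.$$

By the similar triangles argument used before, we have

$$\frac{\|\boldsymbol{\xi}\|}{\|\mathbf{x}\|} = \frac{\|\mathbf{x}\|}{\|\boldsymbol{\eta}\|} = \frac{\|\mathbf{x} - \boldsymbol{\xi}\|}{\|\mathbf{x} - \boldsymbol{\eta}\|}, \qquad \text{and therefore} \qquad \|\mathbf{x} - \boldsymbol{\xi}\| = \|\boldsymbol{\xi}\|\,\|\mathbf{x} - \boldsymbol{\eta}\| \quad \text{when} \quad \|\mathbf{x}\| = 1.$$

As a result, the function

$$v(\mathbf{x}; \boldsymbol{\xi}) = \frac{1}{4\pi}\,\frac{\|\boldsymbol{\eta}\|}{\|\mathbf{x} - \boldsymbol{\eta}\|}$$

has the same boundary values on the unit sphere as the Newtonian potential:

$$\frac{1}{4\pi}\,\frac{\|\boldsymbol{\eta}\|}{\|\mathbf{x} - \boldsymbol{\eta}\|} = \frac{1}{4\pi}\,\frac{1}{\|\mathbf{x} - \boldsymbol{\xi}\|} \qquad \text{whenever} \qquad \|\mathbf{x}\| = 1.$$

We conclude that their difference

$$G(\mathbf{x}; \boldsymbol{\xi}) = \frac{1}{4\pi} \left( \frac{1}{\|\mathbf{x} - \boldsymbol{\xi}\|} - \frac{\|\boldsymbol{\eta}\|}{\|\mathbf{x} - \boldsymbol{\eta}\|} \right) = \frac{1}{4\pi} \left( \frac{1}{\|\mathbf{x} - \boldsymbol{\xi}\|} - \frac{\|\boldsymbol{\xi}\|}{\bigl\|\, \|\boldsymbol{\xi}\|^2\,\mathbf{x} - \boldsymbol{\xi} \,\bigr\|} \right) \tag{12.89}$$

has the required properties of the Green's function: it satisfies the Laplace equation inside
the unit ball except at the delta function singularity $\mathbf{x} = \boldsymbol{\xi}$ and, moreover, it satisfies the
homogeneous Dirichlet conditions $G(\mathbf{x}; \boldsymbol{\xi}) = 0$ on the spherical boundary $\|\mathbf{x}\| = 1$.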
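The defining properties of the Green's function (12.89) can be spot-checked numerically. The following sketch evaluates the second form of (12.89), confirms that it vanishes on the unit sphere, and checks the symmetry $G(\mathbf{x};\boldsymbol{\xi}) = G(\boldsymbol{\xi};\mathbf{x})$. The helper names are illustrative, and the special case $\boldsymbol{\xi} = \mathbf{0}$ is handled by the limiting value of the image term, which is an assumption made explicit in the code.

```python
import math

def norm(v):
    """Euclidean norm of a 3-vector given as a tuple or list."""
    return math.sqrt(sum(c * c for c in v))

def green_ball(x, xi):
    """Dirichlet Green's function (12.89) for the unit ball, evaluated at x
    for a unit impulse at xi (both strictly inside or on the unit sphere)."""
    s = norm(xi)
    direct = 1.0 / norm([a - b for a, b in zip(x, xi)])
    if s == 0.0:
        # limiting value of |xi| / | |xi|^2 x - xi | as xi -> 0
        image = 1.0
    else:
        image = s / norm([s * s * a - b for a, b in zip(x, xi)])
    return (direct - image) / (4 * math.pi)

if __name__ == "__main__":
    xi = (0.3, -0.2, 0.4)
    for boundary_point in ((1.0, 0.0, 0.0), (0.0, 0.0, -1.0), (0.6, 0.0, 0.8)):
        print(boundary_point, green_ball(boundary_point, xi))  # approximately zero
    x = (0.1, 0.5, -0.2)
    print(green_ball(x, xi), green_ball(xi, x))  # symmetric in x and xi
```

The symmetry is visible algebraically as well, since $\|\,\|\boldsymbol{\xi}\|^2\mathbf{x} - \boldsymbol{\xi}\|^2 / \|\boldsymbol{\xi}\|^2 = \|\mathbf{x}\|^2\|\boldsymbol{\xi}\|^2 - 2\,\mathbf{x}\cdot\boldsymbol{\xi} + 1$ is unchanged when $\mathbf{x}$ and $\boldsymbol{\xi}$ are swapped.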

With the Green's function in hand, we can apply the general superposition formula (12.70)
to arrive at a solution to the Dirichlet boundary value problem for the Poisson
equation in the unit ball.

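Theorem 12.11. The solution to the Dirichlet boundary value problem

$$-\Delta u = f \quad \text{for} \quad \|\mathbf{x}\| < 1, \qquad u = 0 \quad \text{for} \quad \|\mathbf{x}\| = 1,$$

on the unit ball is given by the integral

$$u(\mathbf{x}) = \frac{1}{4\pi} \iiint_{\|\boldsymbol{\xi}\| \le 1} \left( \frac{1}{\|\mathbf{x} - \boldsymbol{\xi}\|} - \frac{\|\boldsymbol{\xi}\|}{\bigl\|\, \|\boldsymbol{\xi}\|^2\,\mathbf{x} - \boldsymbol{\xi} \,\bigr\|} \right) f(\xi, \eta, \zeta)\;d\xi\,d\eta\,d\zeta. \tag{12.90}$$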


By  the  same  token,  formula  (12.72)  provides  a  solution  to  the  inhomogeneous  Dirichlet 
boundary  value  problem  for  the  Laplace  equation  on  a  ball. 

Theorem 12.12. The solution to the Dirichlet boundary value problem

$$-\Delta u = 0 \quad \text{for} \quad \|\mathbf{x}\| < 1, \qquad u = h \quad \text{for} \quad \|\mathbf{x}\| = 1,$$

on the unit ball is given by the following surface integral:

$$u(\mathbf{x}) = \frac{1}{4\pi} \iint_{\|\boldsymbol{\xi}\| = 1} \frac{1 - \|\mathbf{x}\|^2}{\|\mathbf{x} - \boldsymbol{\xi}\|^3}\; h(\boldsymbol{\xi})\;dS. \tag{12.91}$$

Proof: We start with the explicit formula (12.89) for the Green's function on the
unit ball. Since the normal derivative on the unit sphere $\|\boldsymbol{\xi}\| = 1$ can be written as
$\partial/\partial\mathbf{n} = \boldsymbol{\xi} \cdot \nabla_{\boldsymbol{\xi}}$, a short computation demonstrates that

$$\frac{\partial G}{\partial \mathbf{n}}(\mathbf{x}; \boldsymbol{\xi}) = -\,\frac{1 - \|\mathbf{x}\|^2}{4\pi\,\|\mathbf{x} - \boldsymbol{\xi}\|^3}.$$

The solution formula (12.91) thus immediately follows from (12.72). Q.E.D.


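For example, the series solution (12.60) to the spherical capacitor problem of Example 12.6
can thus be re-expressed as a surface integral:

$$
\begin{aligned}
u(x, y, z) &= \frac{1}{4\pi} \iint_{\{\xi^2 + \eta^2 + \zeta^2 = 1,\ \zeta > 0\}} \frac{(1 - x^2 - y^2 - z^2)\;dS}{\bigl[\,(\xi - x)^2 + (\eta - y)^2 + (\zeta - z)^2\,\bigr]^{3/2}} \\
&= \frac{1}{4\pi} \int_{-\pi}^{\pi} \int_{0}^{\pi/2} \frac{(1 - x^2 - y^2 - z^2)\,\sin\varphi\;d\varphi\,d\theta}{\bigl[\,(\cos\theta\sin\varphi - x)^2 + (\sin\theta\sin\varphi - y)^2 + (\cos\varphi - z)^2\,\bigr]^{3/2}}.
\end{aligned}
$$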
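The surface integral above can be evaluated directly by quadrature. The sketch below uses a midpoint rule over the upper hemisphere (where the boundary data equal 1); the function name and grid size are assumptions made for illustration. At the center of the ball the result should be $\tfrac12$, the average of the boundary potential, and by symmetry the values at $(0, 0, z)$ and $(0, 0, -z)$ should sum to 1.

```python
import math

def capacitor_potential(x, y, z, n=200):
    """Midpoint-rule value of the Poisson integral (12.91) at an interior
    point (x, y, z) for boundary data 1 on the upper unit hemisphere and 0 below."""
    d_phi = (math.pi / 2) / n
    d_theta = 2 * math.pi / n
    factor = 1 - x * x - y * y - z * z
    total = 0.0
    for i in range(n):
        phi = (i + 0.5) * d_phi
        sp, cp = math.sin(phi), math.cos(phi)
        for j in range(n):
            theta = -math.pi + (j + 0.5) * d_theta
            dist2 = ((math.cos(theta) * sp - x) ** 2
                     + (math.sin(theta) * sp - y) ** 2
                     + (cp - z) ** 2)
            # surface element on the unit sphere is sin(phi) dphi dtheta
            total += sp / dist2**1.5
    return factor * total * d_phi * d_theta / (4 * math.pi)

if __name__ == "__main__":
    print(capacitor_potential(0.0, 0.0, 0.0))   # close to 1/2
    print(capacitor_potential(0.0, 0.0, 0.5))
    print(capacitor_potential(0.0, 0.0, -0.5))
    print(capacitor_potential(0.3, 0.2, 0.1))
```

The quadrature degrades as the evaluation point approaches the sphere, because the kernel becomes sharply peaked there; a finer grid or an adaptive rule would be needed near the boundary.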
Exercises

12.3.1. Find the equilibrium temperature of a sphere of radius 1 whose boundary is held at 0°
while a concentrated unit heat source is applied at (a) the center; (b) a point half-way
between the center and the boundary.

12.3.2.  A  hot  soldering  iron  is  continually  applied  to  the  north  pole  of  a  solid  spherical  ball  of 
radius  1.  Find  the  equilibrium  temperature. 

12.3.3. Write down the gravitational potential, both external and internal, due to a spherical
planet of radius $R$ composed out of a uniform material with density $\rho$.

12.3.4.  (a)  Find  the  gravitational  potential  due  to  a  spherical  shell  of  unit  density  obtained  by 
carving  out  a  spherical  cavity  of  radius  a  from  a  solid  ball  of  radius  b  >  a.  Hint :  Use  the 
solution  to  Exercise  12.3.3.  (b)  What  is  the  gravitational  force  inside  the  cavity? 

(c)  Show  that  outside  the  shell,  the  gravitational  potential  is  as  if  the  entire  mass  were 
concentrated  at  the  origin. 

£  12.3.5.  (a)  Write  down  an  integral  formula  for  the  gravitational  potential  and  gravitational 
force  field  due  to  a  mass  of  unit  density  in  the  shape  of  a  solid  unit  cube  that  is  centered 
at  the  origin,  (b)  Use  numerical  integration  to  determine  the  gravitational  force  vector  at 

the points $(3, 0, 0)$ and $(\sqrt{3}, \sqrt{3}, \sqrt{3})$. Before doing the calculation, see whether you can
predict  which  experiences  a  stronger  force,  and  then  check  your  prediction  numerically. 

(c)  Suppose  the  mass  is  re-formed  into  a  sphere.  How  does  this  affect  the  gravitational 
force  at  the  two  points?  First  predict  whether  it  will  increase,  decrease,  or  stay  the  same. 
Then  test  your  prediction  by  computing  the  values  and  comparing  with  those  you  computed 
in  part  (b). 

12.3.6. A thin hollow metal sphere of unit radius is grounded. Find the electrostatic potential
inside the sphere due to a small solid metal ball of radius $\rho < 1$ placed at its center, assuming
unit charge density throughout the ball.

12.3.7. A thin straight rod of unit density and length $2\ell$ is fixed on the $z$-axis centered at the
origin. Find the induced (a) gravitational potential and (b) gravitational force experienced
by a point $(x, y, z)$ not on the rod.

T 12.3.8. (a) Find the gravitational force due to a thin, uniform straight rod of unit density and
infinite length by letting $\ell \to \infty$ in your solution to Exercise 12.3.7(b). (b) Show that the
force field of part (a) has a potential function that can be identified with the two-dimensional
logarithmic gravitational potential due to a point mass at the origin. Thus, two-dimensional
gravitation can be regarded as a cross-section of three-dimensional gravitation due to
infinitely long vertical line masses. (c) Is your potential function the limit, as $\ell \to \infty$, of
the potential function you found in Exercise 12.3.7(a)? Discuss.

12.3.9. Which well-known solutions to the Laplace equation come from setting $m = n = 0$ in
(12.61)?

12.3.10. Use the Fredholm Alternative to analyze the existence and uniqueness of solutions to
the homogeneous Neumann boundary value problem for the Poisson equation on a bounded
domain $\Omega \subset \mathbb{R}^3$.

^  12.3.11.  Mimic  the  proof  of  Theorem  6.19  to  establish  the  solution  formula  (12.72). 

12.3.12.  Use  the  Method  of  Images  to  find  the  Green’s  function  for  a  solid  hemisphere  of  unit 
radius  subject  to  homogeneous  Dirichlet  boundary  conditions. 
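Thermal diffusion in a uniform isotropic solid body $\Omega \subset \mathbb{R}^3$ is modeled by the
three-dimensional heat equation

$$\frac{\partial u}{\partial t} = \gamma\,\Delta u = \gamma \left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} \right), \qquad (x, y, z) \in \Omega. \tag{12.92}$$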

The positivity of the body's thermal diffusivity, $\gamma > 0$, is required on both physical and
mathematical grounds. The physical derivation is exactly the same as that for the
two-dimensional version (11.1), and does not need to be repeated in detail. Briefly, Fourier's
law expresses the heat flux vector as a multiple of the temperature gradient, $\mathbf{w} = -\kappa\,\nabla u$,
while energy conservation implies that its divergence is proportional to the rate of change of
temperature: $\nabla \cdot \mathbf{w} = -\sigma\,u_t$. Combining these two physical laws and assuming uniformity,
whereby $\kappa$ and $\sigma$ are constant, produces (12.92) with $\gamma = \kappa/\sigma$.

As always, we must impose suitable boundary conditions: either Dirichlet conditions
$u = h$ that specify the boundary temperature; (homogeneous) Neumann conditions
$\partial u/\partial \mathbf{n} = 0$ corresponding to an insulated boundary; or a mixture of the two. Given the
body's temperature

$$u(t_0, x, y, z) = f(x, y, z) \tag{12.93}$$

at an initial time $t_0$, it can be proved, [38, 61, 99], that the resulting initial-boundary value
problem is well-posed, which means that there is a unique classical solution $u(t, x, y, z)$,
defined at all subsequent times $t > t_0$, that depends continuously on the initial data.

As in the one- and two-dimensional versions, we begin by restricting our attention to
homogeneous boundary conditions. Separation of variables works as usual, and we quickly
review the basic ideas. One begins by imposing an exponential solution ansatz

\[ u(t, \mathbf{x}) = e^{-\lambda t}\, v(\mathbf{x}). \]

Substituting into the differential equation and canceling the exponentials, it follows that $v$
satisfies the Helmholtz eigenvalue problem

\[ \gamma\, \Delta v + \lambda\, v = 0, \]

subject to the relevant boundary conditions. For Dirichlet and mixed boundary conditions,
the Laplacian is a positive definite operator, and hence the eigenvalues are all strictly
positive,

\[ 0 < \lambda_1 \le \lambda_2 \le \cdots, \qquad \text{with} \quad \lambda_n \to \infty \quad \text{as} \quad n \to \infty. \]

Moreover, on a bounded domain, the Helmholtz eigenfunctions are complete, and so linear
superposition implies that the solution can be written as an eigenfunction series

\[ u(t, \mathbf{x}) = \sum_{n=1}^{\infty} c_n\, e^{-\lambda_n t}\, v_n(\mathbf{x}). \tag{12.94} \]

The coefficients $c_n$ are uniquely prescribed by the initial condition (12.93):

\[ u(t_0, \mathbf{x}) = \sum_{n=1}^{\infty} c_n\, e^{-\lambda_n t_0}\, v_n(\mathbf{x}) = f(\mathbf{x}). \tag{12.95} \]

Self-adjointness of the boundary value problem implies orthogonality of the eigenfunctions,
and hence the coefficients are obtained via the usual inner product formulae:

\[ c_n = e^{\lambda_n t_0}\, \frac{\displaystyle \iiint_\Omega f(\mathbf{x})\, v_n(\mathbf{x})\, dx\, dy\, dz}{\displaystyle \iiint_\Omega v_n(\mathbf{x})^2\, dx\, dy\, dz}. \tag{12.96} \]

The resulting solution decays exponentially fast to thermal equilibrium, $u(t, \mathbf{x}) \to 0$
as $t \to \infty$, typically at a rate equal to the smallest positive eigenvalue $\lambda_1 > 0$, although
special solutions, whose initial series coefficients vanish, will decay at a faster rate governed
by a higher eigenvalue. Since the higher modes (the terms with $n \gg 0$) go to zero
extremely rapidly with increasing $t$, the solution can be well approximated by the first few
terms in its eigenfunction expansion. As a consequence, the heat equation rapidly smooths
out discontinuities and eliminates high-frequency noise in the initial data.

Unfortunately, explicit formulas for the eigenfunctions and eigenvalues are rare. Most
explicit eigensolutions of the Helmholtz boundary value problem require a further separation
of variables. In a rectangular box, one separates the solution into a product of functions
depending upon the individual Cartesian coordinates, and the eigenfunctions are written
as products of trigonometric functions; see Exercise 12.4.1 for details. In a cylindrical
domain, the separation is effected in cylindrical coordinates, which leads to eigensolutions
involving trigonometric and Bessel functions, as outlined in Exercise 12.4.5. The most
interesting and enlightening case is a spherical domain, and we treat this particular problem
in complete detail in the ensuing subsection.


Exercises 

0 12.4.1. Let $B = \{\, 0 < x < a,\ 0 < y < b,\ 0 < z < c \,\}$ be a solid box of size $a \times b \times c$.
(a) Write down an initial-boundary value problem for the thermodynamics of the box when
its sides are all held at 0° and its initial temperature is $f(x,y,z)$. (b) Use separation
of variables to construct the normal mode solutions. (c) Write down a series representing
the general solution to the initial-boundary value problem. What are the formulas for the
coefficients in your series? (d) What is the equilibrium temperature? How fast does the
temperature in the box decay to equilibrium?

12.4.2.  True  or  false:  In  the  context  of  Exercise  12.4.1,  among  all  boxes  of  a  given  volume  V,  a 
cube  decays  slowest  to  thermal  equilibrium.  What  is  the  cube’s  decay  rate? 

12.4.3.  Answer  Exercises  12.4.1  and  12.4.2  when  the  top  of  the  box,  where  z  =  c,  is  insulated. 

12.4.4. A rectangular brick of size 1 cm × 2 cm × 3 cm made out of material with diffusion
coefficient $\gamma = 6$ is insulated on five sides, while one of its small ends is held at temperature
$u(x,y,0) = \cos \pi x \, \cos 2\pi y$. (a) Find the eventual equilibrium temperature distribution.
(b) If the brick is initially heated in an oven, how fast does it return to equilibrium?

0 12.4.5. Let $C = \{\, 0 \le \sqrt{x^2 + y^2} < a,\ 0 < z < h \,\}$ be a solid cylinder of radius $a$ and height $h$.

(a) Write down an initial-boundary value problem in cylindrical coordinates for the
thermodynamics of the cylinder when its sides, top, and bottom are all held at 0°.

(b) Use separation of variables to write down a series representing the general solution to
the initial-boundary value problem. What are the formulas for the coefficients in your
series?

(c) What is the eventual equilibrium temperature?

(d) How fast does the temperature in the cylinder go to equilibrium?

12.4.6.  Find  the  solution  to  the  initial-boundary  value  problem  in  Exercise  12.4.5  when  the 
initial  temperature  of  the  cylinder  is  uniformly  30°.  Hint :  Use  (11.112)  to  evaluate  the 
coefficients. 


T  12.4.7.  A  cylindrical  can  that  contains  355  ml  of  soda  is  removed  from  the  refrigerator.  Find 
the  optimal  cylindrical  shape  for  such  a  can  in  order  to  keep  the  soda  cold  the  longest. 

Is  this  the  manufactured  shape  of  a  standard  soda  can  in  your  country? 

T  12.4.8.  True  or  false:  Among  all  solid  cylinders  of  a  given  volume,  the  one  that  reaches  thermal 
equilibrium  the  slowest,  when  subject  to  homogeneous  Dirichlet  boundary  conditions,  is  the 
one  that  has  the  least  surface  area.  Justify  your  answer. 


T 12.4.9. Among all fully insulated solid cylinders of unit volume, which cools down
(i) the slowest? (ii) the fastest?

0 12.4.10. Write down a series for the solution to the homogeneous Neumann boundary value
problem for the heat equation on a bounded domain $\Omega \subset \mathbb{R}^3$, corresponding to the
thermodynamics of a completely insulated solid body. What is the equilibrium temperature of the
body? Does the solution decay to equilibrium? If so, how fast?


0 12.4.11. Suppose $u(t, x, y, z)$ is a solution to the heat equation on a fully insulated bounded
domain $\Omega \subset \mathbb{R}^3$. Use the identities in Exercise 12.1.11 to prove the following:

(a) The total heat $H(t) = \iiint_\Omega u(t, x, y, z)\, dx\, dy\, dz$ is conserved, i.e., is constant. Explain
how this can be used to determine the equilibrium temperature of the body.

(b) If $u$ is a non-equilibrium solution, its squared $L^2$ norm $E(t) = \iiint_\Omega u(t, x, y, z)^2\, dx\, dy\, dz$
is a strictly decreasing function of $t$.

(c) Use part (b) to prove uniqueness of solutions to the initial value problem.

^  12.4.12.  State  and  prove  a  Maximum  Principle  for  the  three-dimensional  heat  equation. 


Heating  of  a  Ball 


Our goal is to study heat propagation in a solid spherical body, e.g., the Earth.† For
simplicity, we take the diffusivity $\gamma = 1$, and consider the heat equation on a solid spherical
ball of unit radius, $B_1 = \{\, \| \mathbf{x} \| < 1 \,\}$, that is subject to homogeneous Dirichlet boundary
conditions. Once we know how to solve this particular case, an easy scaling argument, as
outlined in Exercise 12.4.16, will allow us to find the solution for a ball of arbitrary radius
and general diffusivity.

† In this admittedly simplistic model, we are assuming that the Earth is composed of a
completely uniform and isotropic solid material.

As usual, when dealing with a spherical geometry, we adopt spherical coordinates
$r, \varphi, \theta$, as in (12.15), in terms of which the heat equation takes the form

\[ \frac{\partial u}{\partial t} = \Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{2}{r} \frac{\partial u}{\partial r} + \frac{1}{r^2} \frac{\partial^2 u}{\partial \varphi^2} + \frac{\cos \varphi}{r^2 \sin \varphi} \frac{\partial u}{\partial \varphi} + \frac{1}{r^2 \sin^2 \varphi} \frac{\partial^2 u}{\partial \theta^2}, \tag{12.97} \]

where we have used our handy spherical coordinate formula (12.16) for the Laplacian. The
standard diffusive separation of variables ansatz

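\[ u(t, r, \varphi, \theta) = e^{-\lambda t}\, v(r, \varphi, \theta) \]

requires us to analyze the spherical coordinate form of the Helmholtz equation

\[ \Delta v + \lambda v = \frac{\partial^2 v}{\partial r^2} + \frac{2}{r} \frac{\partial v}{\partial r} + \frac{1}{r^2} \frac{\partial^2 v}{\partial \varphi^2} + \frac{\cos \varphi}{r^2 \sin \varphi} \frac{\partial v}{\partial \varphi} + \frac{1}{r^2 \sin^2 \varphi} \frac{\partial^2 v}{\partial \theta^2} + \lambda v = 0 \tag{12.98} \]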

on the unit ball $\Omega = \{ r < 1 \}$ under homogeneous Dirichlet boundary conditions. To make
further progress, we invoke a second variable separation, splitting off the radial coordinate
by setting

\[ v(r, \varphi, \theta) = p(r)\, w(\varphi, \theta). \]

The function $w$ must be $2\pi$-periodic in $\theta$ and well defined on the $z$-axis, i.e., when $\varphi = 0, \pi$.
Substituting this ansatz into (12.98), and separating all the $r$-dependent terms from those
terms depending on the angular variables $\varphi, \theta$ leads to a pair of differential equations
involving a separation constant, denoted by $\mu$. The first is an ordinary differential equation

\[ r^2 \frac{d^2 p}{dr^2} + 2 r \frac{dp}{dr} + (\lambda r^2 - \mu)\, p = 0 \tag{12.99} \]

for the radial component $p(r)$, while the second is a familiar partial differential equation

\[ \Delta_S w + \mu w = \frac{\partial^2 w}{\partial \varphi^2} + \frac{\cos \varphi}{\sin \varphi} \frac{\partial w}{\partial \varphi} + \frac{1}{\sin^2 \varphi} \frac{\partial^2 w}{\partial \theta^2} + \mu w = 0 \tag{12.100} \]

for its angular counterpart $w(\varphi, \theta)$. The operator $\Delta_S$ is the spherical Laplacian from
(12.19). In Section 12.2, we showed that its eigenvalues are

\[ \mu_m = m(m+1) \qquad \text{for} \qquad m = 0, 1, 2, 3, \ldots. \]

The $m$th eigenvalue admits $2m + 1$ linearly independent eigenfunctions: the spherical
harmonics $Y_m^0, \ldots, Y_m^m, \widetilde{Y}_m^1, \ldots, \widetilde{Y}_m^m$ defined in (12.38).
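
The separation step that produces the radial and angular equations can be spelled out. As a sketch, write the Helmholtz equation in the form $v_{rr} + \frac{2}{r} v_r + \frac{1}{r^2} \Delta_S v + \lambda v = 0$, where $\Delta_S$ collects the angular derivatives, substitute $v = p(r)\, w(\varphi, \theta)$, and multiply through by $r^2/(p\, w)$:

\[ \frac{r^2 p'' + 2 r p'}{p} + \lambda r^2 = -\, \frac{\Delta_S w}{w} = \mu. \]

The left-hand side depends only on $r$, while the middle expression depends only on $(\varphi, \theta)$, so both must equal a common constant, called $\mu$ here. Clearing denominators gives the radial equation $r^2 p'' + 2 r p' + (\lambda r^2 - \mu)\, p = 0$ and the angular equation $\Delta_S w + \mu w = 0$.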


Spherical  Bessel  Functions 

The radial ordinary differential equation (12.99) can be solved by setting

\[ q(r) = \sqrt{r}\; p(r). \]

We use the product rule to relate the derivatives of $q$ and $p$, whereby

\[ p = \frac{q}{r^{1/2}}, \qquad \frac{dp}{dr} = \frac{1}{r^{1/2}} \frac{dq}{dr} - \frac{q}{2\, r^{3/2}}, \qquad \frac{d^2 p}{dr^2} = \frac{1}{r^{1/2}} \frac{d^2 q}{dr^2} - \frac{1}{r^{3/2}} \frac{dq}{dr} + \frac{3\, q}{4\, r^{5/2}}. \tag{12.101} \]

Substituting these expressions back into (12.99) with $\mu = \mu_m = m(m+1)$ and multiplying
the resulting equation by $\sqrt{r}$, we discover that $q(r)$ must solve the differential equation

\[ r^2 \frac{d^2 q}{dr^2} + r \frac{dq}{dr} + \left[ \lambda r^2 - \left( m + \tfrac12 \right)^2 \right] q = 0, \tag{12.102} \]

which we recognize as the rescaled Bessel equation (11.56) of half-integer order $m + \tfrac12$.
Consequently, the solution to (12.102) that remains bounded at $r = 0$ is (up to a scalar
multiple) the rescaled Bessel function

\[ q(r) = J_{m+1/2}\big( \sqrt{\lambda}\, r \big). \]

The corresponding solution

\[ p(r) = r^{-1/2}\, J_{m+1/2}\big( \sqrt{\lambda}\, r \big) \tag{12.103} \]

to (12.99) is important enough to warrant a special name.

Definition 12.13. The spherical Bessel function of order $m \ge 0$ is defined by the
formula

\[ S_m(x) = \sqrt{\frac{\pi}{2x}}\; J_{m+1/2}(x). \tag{12.104} \]

Remark: The multiplicative factor $\sqrt{\pi/2}$ is included in the definition so as to avoid
annoying factors of $\sqrt{\pi}$ and $\sqrt{2}$ in the subsequent formulas.

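Surprisingly, unlike the Bessel functions of integer order, the spherical Bessel functions
are all elementary functions! Comparing (12.104) with (11.105), we see that the spherical
Bessel function of order 0 is

\[ S_0(x) = \frac{\sin x}{x}. \tag{12.105} \]

The corresponding explicit formulas for the higher-order spherical Bessel functions can be
obtained through the general recurrence relation

\[ S_{m+1}(x) = -\, \frac{dS_m}{dx} + \frac{m}{x}\, S_m(x), \tag{12.106} \]

which is a consequence of the Bessel function recurrence formula (11.111). Indeed,

\[ \begin{aligned}
\frac{dS_m}{dx} &= \sqrt{\frac{\pi}{2x}}\; \frac{dJ_{m+1/2}}{dx} - \frac12 \sqrt{\frac{\pi}{2}}\; \frac{1}{x^{3/2}}\, J_{m+1/2}(x) \\
&= \sqrt{\frac{\pi}{2x}} \left[ -J_{m+3/2}(x) + \frac{m + \tfrac12}{x}\, J_{m+1/2}(x) \right] - \frac12 \sqrt{\frac{\pi}{2}}\; \frac{1}{x^{3/2}}\, J_{m+1/2}(x) \\
&= -\sqrt{\frac{\pi}{2x}}\; J_{m+3/2}(x) + \frac{m}{x} \sqrt{\frac{\pi}{2x}}\; J_{m+1/2}(x) \\
&= -S_{m+1}(x) + \frac{m}{x}\, S_m(x).
\end{aligned} \]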


The next few spherical Bessel functions are, therefore,

\[ \begin{aligned}
S_1(x) &= -\frac{dS_0}{dx} = -\frac{\cos x}{x} + \frac{\sin x}{x^2}, \\
S_2(x) &= -\frac{dS_1}{dx} + \frac{S_1}{x} = -\frac{\sin x}{x} - \frac{3 \cos x}{x^2} + \frac{3 \sin x}{x^3}, \\
S_3(x) &= -\frac{dS_2}{dx} + \frac{2\, S_2}{x} = \frac{\cos x}{x} - \frac{6 \sin x}{x^2} - \frac{15 \cos x}{x^3} + \frac{15 \sin x}{x^4},
\end{aligned} \tag{12.107} \]

and so on. Figure 12.8 provides graphs of the first four spherical Bessel functions on the
interval $0 \le x \le 20$; the vertical axes range from $-.5$ to $1.0$. We note that

\[ S_0(0) = 1, \qquad \text{whereas} \qquad S_m(0) = 0 \quad \text{for} \quad m > 0, \tag{12.108} \]

whose proof is the task of Exercise 12.4.26. Thus, our radial solution (12.103) is, apart
from an inessential constant multiple, a rescaled spherical Bessel function of order $m$:

\[ p(r) = S_m\big( \sqrt{\lambda}\, r \big). \]
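
The closed-form expressions for the first few spherical Bessel functions can be sanity-checked numerically against the recurrence $S_{m+1} = -S_m' + (m/x)\, S_m$. The following is a minimal sketch using only the Python standard library; the function names and the finite-difference helper `derivative` are illustrative choices, not part of the text.

```python
import math

# Closed forms of S_0 .. S_3, i.e. sin x / x and the three expressions
# obtained from it by the recurrence S_{m+1} = -S_m' + (m/x) S_m.
def S0(x):
    return math.sin(x) / x

def S1(x):
    return -math.cos(x) / x + math.sin(x) / x**2

def S2(x):
    return -math.sin(x) / x - 3 * math.cos(x) / x**2 + 3 * math.sin(x) / x**3

def S3(x):
    return (math.cos(x) / x - 6 * math.sin(x) / x**2
            - 15 * math.cos(x) / x**3 + 15 * math.sin(x) / x**4)

S = [S0, S1, S2, S3]

def derivative(f, x, h=1e-5):
    """Central-difference approximation to f'(x) (hypothetical helper)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def recurrence_residual(m, x):
    """S_{m+1}(x) + S_m'(x) - (m/x) S_m(x); zero if the recurrence holds."""
    return S[m + 1](x) + derivative(S[m], x) - (m / x) * S[m](x)

# Largest residual over a few orders and sample points; it should be tiny,
# limited only by the finite-difference error.
max_residual = max(abs(recurrence_residual(m, x))
                   for m in range(3)
                   for x in (0.7, 1.5, 3.0, 7.0, 12.0))
print(max_residual)
```

A residual at the level of the finite-difference error indicates that the transcribed formulas are mutually consistent; it is a numerical check, not a proof.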
Figure 12.8. Spherical Bessel functions.


So far, we have not taken into account the (homogeneous) Dirichlet boundary condition
at $r = 1$. This requires

\[ p(1) = 0, \qquad \text{and hence} \qquad S_m\big( \sqrt{\lambda} \big) = 0. \]

Therefore, $\sqrt{\lambda}$ must be a root of the $m$th order spherical Bessel function. We introduce the
notation

\[ 0 < \sigma_{m,1} < \sigma_{m,2} < \sigma_{m,3} < \cdots \]

to denote the successive (positive) spherical Bessel roots, satisfying

\[ S_m(\sigma_{m,n}) = 0 \qquad \text{for} \qquad n = 1, 2, \ldots. \tag{12.109} \]

In particular, the roots of the zeroth order spherical Bessel function $S_0(x) = x^{-1} \sin x$ are
just the integer multiples of $\pi$:

\[ \sigma_{0,n} = n \pi \qquad \text{for} \qquad n = 1, 2, \ldots. \]

The higher-order roots are not expressible in terms of known constants. A table of all
spherical Bessel roots that are $< 13$ appears below. The columns of the table are indexed
by $m$, the order, while the rows are indexed by $n$, the root number.
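
Since the higher-order roots have no closed form, they have to be computed numerically. The sketch below locates the spherical Bessel roots below 13 for orders $m = 0, \ldots, 3$ by scanning for sign changes and refining with bisection. It assumes the elementary closed forms of $S_0, \ldots, S_3$; the helper names `bisect` and `roots_below`, the grid step, and the starting point are illustrative choices.

```python
import math

# Closed forms of the spherical Bessel functions S_0 .. S_3.
SPHERICAL_BESSEL = [
    lambda x: math.sin(x) / x,
    lambda x: -math.cos(x) / x + math.sin(x) / x**2,
    lambda x: -math.sin(x) / x - 3 * math.cos(x) / x**2 + 3 * math.sin(x) / x**3,
    lambda x: (math.cos(x) / x - 6 * math.sin(x) / x**2
               - 15 * math.cos(x) / x**3 + 15 * math.sin(x) / x**4),
]

def bisect(f, a, b, tol=1e-12):
    """Bisection on [a, b], assuming f(a) and f(b) have opposite signs."""
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = f(mid)
        if fa * fm <= 0:
            b = mid
        else:
            a, fa = mid, fm
    return 0.5 * (a + b)

def roots_below(f, upper=13.0, start=0.5, step=0.01):
    """Roots of f in (start, upper), found from sign changes on a uniform grid."""
    found = []
    n_steps = int(round((upper - start) / step))
    for i in range(n_steps):
        a = start + i * step
        b = a + step
        if f(a) * f(b) < 0:
            found.append(bisect(f, a, b))
    return found

# roots[m][n - 1] approximates the n-th positive root of S_m.
roots = [roots_below(S_m) for S_m in SPHERICAL_BESSEL]
for m, sigma in enumerate(roots):
    print(m, [round(s, 4) for s in sigma])
```

For $m = 0$ the computed values can be compared directly with the exact roots $n\pi$, which gives a quick check on the root finder before trusting the higher orders.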

Reassembling the individual constituents, we have now demonstrated that the separable
eigenfunctions of the Helmholtz equation on a solid ball of radius 1, when subject
to homogeneous Dirichlet boundary conditions, are products of spherical Bessel functions
and spherical harmonics,

\[ \begin{aligned}
v_{k,m,n}(r, \varphi, \theta) &= S_m(\sigma_{m,n}\, r)\, Y_m^k(\varphi, \theta), \\
\widetilde{v}_{k,m,n}(r, \varphi, \theta) &= S_m(\sigma_{m,n}\, r)\, \widetilde{Y}_m^k(\varphi, \theta),
\end{aligned} \qquad \begin{aligned} m &= 0, 1, 2, \ldots, \\ k &= 0, \ldots, m, \\ n &= 1, 2, 3, \ldots. \end{aligned} \tag{12.110} \]

[Table: Spherical Bessel Roots $\sigma_{m,n}$]

The corresponding eigenvalues

\[ \lambda_{m,n} = \sigma_{m,n}^2, \qquad m = 0, 1, 2, \ldots, \quad n = 1, 2, 3, \ldots, \tag{12.111} \]


are the squared spherical Bessel roots. Since there are $2m + 1$ independent spherical
harmonics of order $m$, the eigenvalue $\lambda_{m,n}$ admits $2m + 1$ linearly independent
eigenfunctions, namely $v_{0,m,n}, \ldots, v_{m,m,n}, \widetilde{v}_{1,m,n}, \ldots, \widetilde{v}_{m,m,n}$. In particular, the radially symmetric
solutions are the eigenfunctions with $k = m = 0$:

\[ v_n(r) = v_{0,0,n}(r) = S_0(\sigma_{0,n}\, r) = \frac{\sin n \pi r}{n \pi r}, \qquad n = 1, 2, \ldots. \tag{12.112} \]

Further analysis, cf. [34], demonstrates that the separable solutions (12.110) form a
complete system of eigenfunctions for the Helmholtz equation on the unit ball with
homogeneous Dirichlet boundary conditions.

We have thus completely determined the basic separable solutions to the heat equation
on a solid unit ball subject to homogeneous Dirichlet boundary conditions. They are
products of exponential functions of time, spherical Bessel functions of the radius, and
spherical harmonics:


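\[ \begin{aligned}
u_{k,m,n}(t, r, \varphi, \theta) &= e^{-\sigma_{m,n}^2 t}\, S_m(\sigma_{m,n}\, r)\, Y_m^k(\varphi, \theta), \\
\widetilde{u}_{k,m,n}(t, r, \varphi, \theta) &= e^{-\sigma_{m,n}^2 t}\, S_m(\sigma_{m,n}\, r)\, \widetilde{Y}_m^k(\varphi, \theta).
\end{aligned} \tag{12.113} \]

The general solution can be written as an infinite “Fourier-Bessel-spherical harmonic”
series in these fundamental modes:

\[ u(t, r, \varphi, \theta) = \sum_{m=0}^{\infty} \sum_{n=1}^{\infty} e^{-\sigma_{m,n}^2 t}\, S_m(\sigma_{m,n}\, r) \left[ \frac{c_{0,m,n}}{2}\, Y_m^0(\varphi, \theta) + \sum_{k=1}^{m} \Big( c_{k,m,n}\, Y_m^k(\varphi, \theta) + \widetilde{c}_{k,m,n}\, \widetilde{Y}_m^k(\varphi, \theta) \Big) \right]. \tag{12.114} \]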


The series’ coefficients are uniquely prescribed by the initial data $u(0, r, \varphi, \theta) = f(r, \varphi, \theta)$,
and their explicit formulae†

\[ \begin{aligned}
c_{k,m,n} &= \frac{(2m+1)\,(m-k)!}{\pi\, (m+k)!\; S_{m+1}(\sigma_{m,n})^2} \int_{-\pi}^{\pi} \int_0^{\pi} \int_0^1 f(r, \varphi, \theta)\, v_{k,m,n}(r, \varphi, \theta)\, r^2 \sin \varphi \; dr\, d\varphi\, d\theta, \\
\widetilde{c}_{k,m,n} &= \frac{(2m+1)\,(m-k)!}{\pi\, (m+k)!\; S_{m+1}(\sigma_{m,n})^2} \int_{-\pi}^{\pi} \int_0^{\pi} \int_0^1 f(r, \varphi, \theta)\, \widetilde{v}_{k,m,n}(r, \varphi, \theta)\, r^2 \sin \varphi \; dr\, d\varphi\, d\theta,
\end{aligned} \tag{12.115} \]

follow from the usual orthogonality relations among the eigenfunctions, combined with the
formulas

\[ \| v_{0,m,n} \| = \sqrt{\frac{2\pi}{2m+1}}\; \big| S_{m+1}(\sigma_{m,n}) \big|, \qquad \| v_{k,m,n} \| = \| \widetilde{v}_{k,m,n} \| = \sqrt{\frac{\pi\, (m+k)!}{(2m+1)\,(m-k)!}}\; \big| S_{m+1}(\sigma_{m,n}) \big|, \quad k > 0, \tag{12.116} \]

for their norms, to be established in Exercise 12.4.29. In particular, the slowest-decaying
mode is the spherically symmetric function

\[ u_{0,0,1}(t, r) = \frac{e^{-\pi^2 t}\, \sin \pi r}{\pi r}, \tag{12.117} \]

corresponding to the smallest eigenvalue $\lambda_{0,1} = \pi^2$. Therefore, typically, the decay
to thermal equilibrium of a unit sphere is at an exponential rate of $\pi^2 \approx 9.8696$, or, to a
very rough approximation, 10.
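
To see how quickly the slowest mode takes over, one can evaluate a radially symmetric series numerically. The sketch below makes several assumptions for illustration only: a unit ball with diffusivity 1, homogeneous Dirichlet conditions, and uniform initial temperature $f \equiv 1$. For that choice only the radial modes $\sin(n\pi r)/(n\pi r)$ contribute, and a hand computation with the radial inner product $\int_0^1 f\, g\, r^2\, dr$ gives the coefficients $2(-1)^{n+1}$; these coefficients and the function names are not taken from the text.

```python
import math

def radial_mode(n, r):
    """S_0(n pi r) = sin(n pi r)/(n pi r), with the limiting value 1 at r = 0."""
    x = n * math.pi * r
    return 1.0 if x == 0.0 else math.sin(x) / x

def ball_temperature(t, r, terms=50):
    """Partial sum of the radial series for f = 1 on the unit ball (t > 0)."""
    total = 0.0
    for n in range(1, terms + 1):
        c_n = 2.0 * (-1) ** (n + 1)   # coefficient of the n-th radial mode for f = 1
        total += c_n * math.exp(-(n * math.pi) ** 2 * t) * radial_mode(n, r)
    return total

def leading_mode(t, r):
    """The slowest-decaying term alone: 2 exp(-pi^2 t) sin(pi r)/(pi r)."""
    return 2.0 * math.exp(-math.pi ** 2 * t) * radial_mode(1, r)

# Compare the full series with its leading term at the center of the ball.
for t in (0.05, 0.1, 0.5):
    full = ball_temperature(t, 0.0)
    lead = leading_mode(t, 0.0)
    print(t, full, lead, full / lead)
```

The ratio in the last column approaches 1 as $t$ grows, which is the numerical counterpart of the statement that the decay rate is governed by $\pi^2$. At $t = 0$ the series at the center converges only in a generalized sense, so the sketch is meant for $t > 0$.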


Exercises 

12.4.13.  It  takes  a  solid  ball  of  radius  1  cm  ten  minutes  to  return  to  (approximate)  thermal 
equilibrium.  How  long  does  it  take  a  similar  ball  of  radius  2? 

12.4.14.  If  a  200-gram  potato  served  hot  from  the  oven  takes  15  minutes  until  its  maximum 
temperature  is  less  than  40°  C,  how  long  does  it  take  a  300-gram  potato  of  the  same  shape 
to  cool  off? 

T 12.4.15. A uniform solid metal ball of radius 1 meter, with diffusion coefficient $\gamma = 2$, is taken
from a 300° oven and immersed in a bucket of ice water. (a) Write down an initial-boundary
value problem that describes the temperature of the ball. (b) Find a series solution for
the temperature. (c) At what time is the temperature < 50° throughout the ball?

0 12.4.16. Find the decay rate to thermal equilibrium of a solid spherical ball of radius $R$ and
diffusion coefficient $\gamma$ when subject to homogeneous Dirichlet boundary conditions.

12.4.17.  True  or  false:  A  heated  solid  hemisphere  placed  in  a  0°  environment  cools  down  twice 
as  fast  as  a  solid  sphere  of  the  same  radius  made  out  of  the  same  material. 

12.4.18. A fully insulated solid spherical ball of radius 1 has initial temperature distribution
$f(r, \varphi, \theta)$. (a) Write down a formula for the equilibrium temperature of the ball.
(b) What is the rate of decay of the ball to thermal equilibrium?


† We use the spherical coordinate form of the $L^2$ inner product on the ball.

12.4.19. Which cools down to equilibrium faster: a fully insulated solid ball or one whose
boundary is held fixed at 0°? How much faster?

12.4.20.  A  solid  sphere  and  solid  cube  are  made  out  of  the  same  material  and  have  the  same 
volume.  Both  are  heated  in  an  oven  and  then  submerged  in  a  large  vat  of  water.  Which 
will  cool  down  faster?  Explain  and  justify  your  answer. 

12.4.21.  Answer  Exercise  12.4.20  when  the  two  solids  have  the  same  surface  area. 

12.4.22.  Suppose  the  solid  spherical  shell  in  Exercise  12.2.7  starts  off  at  room  temperature. 
Assuming  that  the  water  in  the  center  remains  at  100°,  find  the  rate  at  which  the  shell 
tends  to  thermal  equilibrium. 


T 12.4.23. The thermodynamics of a thin, uniform, spherical shell of unit radius is governed by
the spherical heat equation $u_t = \gamma\, \Delta_S u$, $u(0, \varphi, \theta) = f(\varphi, \theta)$, in which $\Delta_S$ is the spherical
Laplacian (12.19). The solution $u(t, \varphi, \theta)$ represents the temperature of the point on the
unit sphere with angular coordinates $\varphi, \theta$, while $f(\varphi, \theta)$ is the initial temperature
distribution. (a) Find the eigensolutions. (b) Write down the solution to the initial value problem
as a series in eigensolutions. (c) What is the final equilibrium temperature of the spherical
shell? (d) What is its rate of decay to equilibrium? (e) Find the solution and the final
equilibrium temperature when $f(\varphi, \theta) =$ (i) $\sin \varphi \cos \theta$; (ii) $\cos 2\varphi$.

12.4.24. A spherical potato, of radius $R = 7.5$ cm and thermal diffusivity $\gamma = .3$ cm²/sec, is
initially at room temperature, 25° C, and is placed in a pot of boiling water at 100° C.
The potato is cooked when it has reached the temperature of at least 90° C throughout.
How long do you have to wait until the potato is done?


12.4.25. (a) Explain why the spherical Bessel function $S_1(x)$ is bounded at $x = 0$.
What is $S_1(0)$? (b) Answer the same question for $S_2(x)$.

12.4.26.  Prove  the  formulae  (12.108). 

0 12.4.27. (a) Find a recurrence relation expressing the spherical Bessel function $S_{m-1}(x)$ in
terms of $S_m(x)$. (b) Prove that

\[ \frac{d}{dx} \Big[ x^3 \big( S_m(x)^2 - S_{m-1}(x)\, S_{m+1}(x) \big) \Big] = 2\, x^2\, S_m(x)^2. \]

12.4.28. Let $m \ge 0$ be a fixed integer. (a) Prove that the rescaled spherical Bessel functions
$v_n(r) = S_m(\sigma_{m,n}\, r)$, $n = 1, 2, \ldots$, are mutually orthogonal under the inner product
$\langle f, g \rangle = \int_0^1 f(r)\, g(r)\, r^2\, dr$. (b) Prove that $\| v_n \| = \frac{1}{\sqrt{2}}\, | S_{m+1}(\sigma_{m,n}) |$. Hint: Mimic the
method outlined in Exercise 11.4.22, using the identity in Exercise 12.4.27(b).

0 12.4.29. (a) Use the result of Exercise 12.4.28 to prove the formulae (12.116) for the $L^2$ norms
of the eigenfunctions (12.110). (b) Justify the formulae (12.115).


The  Fundamental  Solution  to  the  Heat  Equation  in  Space 

For the heat equation (as well as more general diffusion equations), the fundamental
solution measures the response of the body to an instantaneously applied concentrated
unit heat source. Thus, given a point $\boldsymbol{\xi} = (\xi, \eta, \zeta) \in \Omega$ within the body, the fundamental
solution

\[ u(t, \mathbf{x}) = F(t, \mathbf{x}; \boldsymbol{\xi}) = F(t, x, y, z; \xi, \eta, \zeta) \]

solves the initial-boundary value problem

\[ u_t = \Delta u, \qquad u(0, \mathbf{x}) = \delta(\mathbf{x} - \boldsymbol{\xi}), \qquad \text{for} \quad \mathbf{x} \in \Omega, \quad t > 0, \tag{12.118} \]

subject to the selected homogeneous boundary conditions: Dirichlet, Neumann, or mixed.

Explicit formulas for the fundamental solution are rare, although in bounded domains
it is possible to construct it as an eigenfunction series, as described in Section 9.5. The
one case amenable to a complete analysis is that in which the heat is distributed over
all of three-dimensional space, so $\Omega = \mathbb{R}^3$. We recall that Lemma 11.11 showed how to
construct solutions of the two-dimensional heat equation as products of one-dimensional
solutions. In a similar manner, if $p(t, x)$, $q(t, x)$, and $r(t, x)$ are any three solutions to the
one-dimensional heat equation $u_t = \gamma\, u_{xx}$, then their product

\[ u(t, x, y, z) = p(t, x)\, q(t, y)\, r(t, z) \tag{12.119} \]

is a solution to the three-dimensional heat equation

\[ u_t = \gamma\, ( u_{xx} + u_{yy} + u_{zz} ). \]

In particular, choosing

\[ p(t, x) = \frac{e^{-(x - \xi)^2/(4 \gamma t)}}{2 \sqrt{\pi \gamma t}}, \qquad q(t, y) = \frac{e^{-(y - \eta)^2/(4 \gamma t)}}{2 \sqrt{\pi \gamma t}}, \qquad r(t, z) = \frac{e^{-(z - \zeta)^2/(4 \gamma t)}}{2 \sqrt{\pi \gamma t}}, \]

to all be one-dimensional fundamental solutions, we are immediately led to the fundamental
solution in the form of a three-dimensional Gaussian filter.

Theorem 12.14. The fundamental solution

\[ F(t, \mathbf{x}; \boldsymbol{\xi}) = F(t, \mathbf{x} - \boldsymbol{\xi}) = \frac{e^{-\| \mathbf{x} - \boldsymbol{\xi} \|^2/(4 \gamma t)}}{8\, (\pi \gamma t)^{3/2}} \tag{12.120} \]

solves the three-dimensional heat equation $u_t = \gamma\, \Delta u$ on $\mathbb{R}^3$ for $t > 0$, with an initial
temperature equal to a delta function concentrated at the point $\mathbf{x} = \boldsymbol{\xi}$.

Thus, the initially concentrated heat energy immediately begins to spread out in a
spherically symmetric manner, with a minuscule, but nonzero effect that is felt immediately
arbitrarily far away from the initial concentration. At each individual point $\mathbf{x} \in \mathbb{R}^3$, after
an initial warm-up, the temperature decays back to zero at a rate proportional to $t^{-3/2}$,
which is more rapid than in two dimensions, because, intuitively, there are more directions in
which the heat energy can disperse.
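
The Gaussian formula for the fundamental solution can be checked numerically. The sketch below, under the assumption of an arbitrary illustrative diffusivity `GAMMA = 0.7` and a source at the origin, confirms that the product of three one-dimensional heat kernels agrees with the three-dimensional formula, and that a finite-difference approximation of $u_t - \gamma\, \Delta u$ is close to zero at a sample point. The function names and step sizes are hypothetical choices.

```python
import math

GAMMA = 0.7  # illustrative diffusivity; an assumed value, not from the text

def heat_kernel_1d(t, x, xi=0.0, gamma=GAMMA):
    """One-dimensional fundamental solution centered at xi."""
    return math.exp(-(x - xi) ** 2 / (4 * gamma * t)) / (2 * math.sqrt(math.pi * gamma * t))

def heat_kernel_3d(t, x, y, z, gamma=GAMMA):
    """Three-dimensional Gaussian kernel with the source point at the origin."""
    r2 = x * x + y * y + z * z
    return math.exp(-r2 / (4 * gamma * t)) / (8 * (math.pi * gamma * t) ** 1.5)

def pde_residual(t, x, y, z, h=1e-3, dt=1e-4):
    """Finite-difference approximation of u_t - gamma * Laplacian(u) for the 3D kernel."""
    F = heat_kernel_3d
    u_t = (F(t + dt, x, y, z) - F(t - dt, x, y, z)) / (2 * dt)
    laplacian = (F(t, x + h, y, z) + F(t, x - h, y, z)
                 + F(t, x, y + h, z) + F(t, x, y - h, z)
                 + F(t, x, y, z + h) + F(t, x, y, z - h)
                 - 6 * F(t, x, y, z)) / h ** 2
    return u_t - GAMMA * laplacian

t, x, y, z = 0.5, 0.3, -0.4, 0.8
product = heat_kernel_1d(t, x) * heat_kernel_1d(t, y) * heat_kernel_1d(t, z)
print(product, heat_kernel_3d(t, x, y, z), pde_residual(t, x, y, z))
```

The residual is not exactly zero because of the truncation error in the difference quotients; shrinking `h` and `dt` moderately reduces it until round-off takes over.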

To solve a more general initial value problem with the initial temperature distributed
over all of space, we first write

\[ u(0, \mathbf{x}) = f(\mathbf{x}) = \iiint f(\boldsymbol{\xi})\, \delta(\mathbf{x} - \boldsymbol{\xi})\; d\xi\, d\eta\, d\zeta \]

as a linear superposition of delta functions. By linearity, the solution to the initial value
problem is given by the corresponding superposition

\[ u(t, \mathbf{x}) = \frac{1}{8\, (\pi \gamma t)^{3/2}} \iiint f(\boldsymbol{\xi})\, e^{-\| \mathbf{x} - \boldsymbol{\xi} \|^2/(4 \gamma t)}\; d\xi\, d\eta\, d\zeta \tag{12.121} \]

of the fundamental solutions. Since the fundamental solution has exponential decay as
$\| \mathbf{x} \| \to \infty$, the superposition formula is valid even for initial temperature distributions
that are moderately increasing at large distances. We remark that the integral (12.121)
has the form of a three-dimensional convolution

\[ u(t, \mathbf{x}) = F(t, \mathbf{x}) * f(\mathbf{x}) = \iiint f(\boldsymbol{\xi})\, F(t, \mathbf{x} - \boldsymbol{\xi})\; d\xi\, d\eta\, d\zeta \tag{12.122} \]

of the initial data with a one-parameter family of increasingly spread-out Gaussian filters.
Thus, as before, convolution with a Gaussian filter has a smoothing effect on the initial
temperature distribution.


Exercises 

12.4.30.  True  or  false:  In  a  three-dimensional  medium,  heat  energy  propagates  at  infinite  speed. 

12.4.31. A solid spherical ball of radius 1 is heated to 100° and inserted into a three-dimensional
medium filling the rest of $\mathbb{R}^3$ with uniform temperature 0°.

(a) Write down an integral formula for the subsequent temperature distribution over $\mathbb{R}^3$ at
time $t > 0$, assuming a common diffusion coefficient $\gamma = 1$.

(b)  Evaluate  the  resulting  integral  using  spherical  coordinates. 

12.4.32. (a) Prove that $u(t, r)$ is a spherically symmetric solution to the three-dimensional heat
equation if and only if $w(t, r) = r\, u(t, r)$ solves the one-dimensional heat equation: $w_t = w_{rr}$.
(b) True or false: If $w(t, r)$ is the fundamental solution for the one-dimensional heat
equation based at $r = 0$, then $u(t, r) = w(t, r)/r$ is the fundamental solution for the
three-dimensional heat equation based at the origin.

12.4.33.  Construct  the  solution  to  the  initial  value  problem  in  Exercise  12.4.31  using  radial 
symmetry  and  Exercise  12.4.32. 

C 12.4.34. Suppose that, as Earth orbits the sun, its surface is subject to yearly periodic
temperature variations $a \cos \omega t$, where the frequency $\omega$ is given by (4.56). (a) Assuming,
for simplicity, that the Earth is a homogeneous solid ball, of radius $R$, formulate an
initial-boundary value problem that governs the temperature fluctuations within the Earth due to
its orbiting the sun. (b) At what depth does the temperature vary out of phase with the
surface, i.e., is the warmest in winter and coldest in summer? Compare your answer with
the root cellar computation at the end of Section 4.1. Hint: Use Exercise 12.4.32.

12.4.35.  (a)  Prove  that  if  u(t,x)  is  any  (sufficiently  smooth)  solution  to  the  heat  equation,  so  is 
its  time  derivative  v  =  du/dt.  (b)  Write  out  the  time  derivative  of  the  fundamental 
solution,  and  the  initial  value  problem  it  satisfies. 

12.4.36. Write down an explicit eigenfunction series for the fundamental solution $F(t, \mathbf{x}; \boldsymbol{\xi})$ to
the heat equation in a unit cube with thermal diffusivity $\gamma = 1$ that is subject to
homogeneous Dirichlet boundary conditions.

12.4.37. Write down an explicit eigenfunction series for the fundamental solution $F(t, \mathbf{x}; \boldsymbol{\xi})$ to
the heat equation in a ball of radius 1 that has thermal diffusivity $\gamma = 1$ and is subject to
homogeneous Dirichlet boundary conditions.

0 12.4.38. Justify the statement that formula (12.119) provides a solution to the
three-dimensional heat equation.

12.4.39.  Fill  in  the  details  of  the  proof  of  Theorem  12.14. 


12.5  The  Wave  Equation  for  Three-Dimensional  Media 

The three-dimensional wave equation

\[ u_{tt} = c^2\, \Delta u = c^2\, ( u_{xx} + u_{yy} + u_{zz} ), \tag{12.123} \]

in which $c > 0$ denotes the speed of light, governs the propagation of waves in a
homogeneous isotropic three-dimensional medium, e.g., electromagnetic waves (light, X-rays, radio
waves, etc.) in empty space. In this context, while the electric and magnetic vector fields
$\mathbf{E}, \mathbf{B}$ are intrinsically coupled by the more complicated system of Maxwell’s equations, each
individual component satisfies the wave equation; see Exercise 12.5.14 for details.

The wave equation also models certain restricted classes of vibrations of a uniform
solid body. The solution $u(t, \mathbf{x}) = u(t, x, y, z)$ represents a scalar-valued displacement of
the body at time $t$ and position $\mathbf{x} = (x, y, z) \in \Omega \subset \mathbb{R}^3$. For example, $u(t, \mathbf{x})$ might
represent the radial displacement of the body. One imposes suitable boundary conditions,
e.g., Dirichlet, Neumann, or mixed, on $\partial \Omega$, along with a pair of initial conditions

\[ u(0, \mathbf{x}) = f(\mathbf{x}), \qquad \frac{\partial u}{\partial t}(0, \mathbf{x}) = g(\mathbf{x}), \qquad \mathbf{x} \in \Omega, \tag{12.124} \]

that specify the body’s initial displacement and initial velocity. As long as the initial and
boundary data are reasonably nice, there exists a unique classical solution to the
initial-boundary value problem for all $-\infty < t < \infty$, cf. [38, 61, 99]. Thus, in contrast to the
heat equation, one can follow solutions to the wave equation both forwards and backwards
in time.

Let us focus our attention on the homogeneous boundary value problem. The fundamental
vibrational modes are found by imposing our usual trigonometric ansatz

$$ u(t,x,y,z) = \cos(\omega t)\, v(x,y,z) \qquad \text{or} \qquad \sin(\omega t)\, v(x,y,z). $$

Substituting into the wave equation (12.123), we discover (yet again) that $v(x,y,z)$ must
be an eigenfunction for the associated Helmholtz eigenvalue problem

$$ \Delta v + \lambda v = 0, \qquad \text{where} \qquad \lambda = \frac{\omega^2}{c^2}, \tag{12.125} $$

coupled to the relevant boundary conditions. In the positive definite cases, i.e., Dirichlet
and mixed boundary conditions, the eigenvalues $\lambda_k = \omega_k^2/c^2 > 0$ are all positive. Each
eigenfunction $v_k(x,y,z)$ yields two normal vibrational modes

$$ u_k(t,x,y,z) = \cos(\omega_k t)\, v_k(x,y,z), \qquad \tilde u_k(t,x,y,z) = \sin(\omega_k t)\, v_k(x,y,z), $$

of frequency $\omega_k = c\sqrt{\lambda_k}$ equal to the square root of the corresponding eigenvalue multiplied
by the wave speed. The general solution is a quasiperiodic linear combination,


$$ u(t,x,y,z) = \sum_{k=1}^{\infty} \bigl[\, a_k \cos(\omega_k t) + b_k \sin(\omega_k t) \,\bigr]\, v_k(x,y,z) \tag{12.126} $$

of the eigenmodes. The coefficients $a_k, b_k$ are uniquely prescribed by the initial conditions
(12.124). Thus,

$$ u(0,x,y,z) = \sum_{k=1}^{\infty} a_k v_k(x,y,z) = f(x,y,z), \qquad \frac{\partial u}{\partial t}(0,x,y,z) = \sum_{k=1}^{\infty} \omega_k b_k v_k(x,y,z) = g(x,y,z). $$

The explicit formulas follow immediately from the orthogonality of the eigenfunctions:

$$ a_k = \frac{\langle f, v_k \rangle}{\| v_k \|^2} = \frac{\iiint_\Omega f\, v_k \, dx\, dy\, dz}{\iiint_\Omega v_k^2 \, dx\, dy\, dz}, \qquad b_k = \frac{1}{\omega_k} \frac{\langle g, v_k \rangle}{\| v_k \|^2} = \frac{\iiint_\Omega g\, v_k \, dx\, dy\, dz}{\omega_k \iiint_\Omega v_k^2 \, dx\, dy\, dz}. \tag{12.127} $$

In the positive semi-definite Neumann case, there is an additional zero eigenvalue
$\lambda_0 = 0$ corresponding to the constant null eigenfunction $v_0(x,y,z) \equiv 1$. This results in two
additional terms in the eigenfunction expansion: a constant term

$$ a_0 = \frac{1}{\operatorname{vol} \Omega} \iiint_\Omega f(x,y,z) \, dx\, dy\, dz $$

that equals the average initial displacement, and an unstable mode $b_0\, t$ that grows linearly
in time, whose speed

$$ b_0 = \frac{1}{\operatorname{vol} \Omega} \iiint_\Omega g(x,y,z) \, dx\, dy\, dz $$

is the average initial velocity over the entire body. Thus, the unstable mode will be excited
if and only if there is a nonzero net initial velocity: $b_0 \neq 0$.

Most  of  the  basic  solution  techniques  we  learned  in  the  two-dimensional  case  apply 
here,  and  we  will  not  dwell  on  the  details.  The  case  of  a  rectangular  box  is  a  particularly 
straightforward  application  of  the  method  of  separation  of  variables,  and  is  outlined  in  the 
exercises.  A  similar  analysis,  now  in  cylindrical  coordinates,  can  be  applied  to  the  case  of 
a  vibrating  cylinder.  The  most  interesting  case  is  that  of  a  solid  spherical  ball,  which  is 
the  subject  of  the  next  subsection. 


Vibration  of  Balls  and  Spheres 


Let  us  focus  on  the  radial  vibrations  of  a  solid  ball,  as  modeled  by  the  three-dimensional 
wave  equation  (12.123).  The  solution  u(t,x,y,z)  represents  the  radial  displacement  of  the 
“atom”  that  is  situated  at  position  (x,y,z)  when  the  ball  is  at  rest. 

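For simplicity, we look at the Dirichlet boundary value problem on the unit ball
$B_1 = \{\, \| \mathbf{x} \| < 1 \,\}$. The normal modes of vibration are governed by the Helmholtz equation
(12.125) subject to homogeneous Dirichlet boundary conditions. According to (12.110),
the eigenfunctions are

$$ \begin{aligned} v_{0,m,n}(r,\varphi,\theta) &= S_m(\sigma_{m,n}\, r)\, Y_m^0(\varphi,\theta), \\ v_{k,m,n}(r,\varphi,\theta) &= S_m(\sigma_{m,n}\, r)\, Y_m^k(\varphi,\theta), \\ \tilde v_{k,m,n}(r,\varphi,\theta) &= S_m(\sigma_{m,n}\, r)\, \tilde Y_m^k(\varphi,\theta), \end{aligned} \qquad \text{for} \qquad \begin{aligned} n &= 1,2,3,\ldots, \\ m &= 0,1,2,\ldots, \\ k &= 1,2,\ldots,m. \end{aligned} \tag{12.128} $$

Here $S_m$ denotes the $m$th order spherical Bessel function (12.104), $\sigma_{m,n}$ is its $n$th root, as
in (12.109), while $Y_m^k$, $\tilde Y_m^k$ are the spherical harmonics (12.38). Each eigenvalue

$$ \lambda_{m,n} = \sigma_{m,n}^2, \qquad m = 0,1,2,\ldots, \quad n = 1,2,3,\ldots, $$

corresponds to $2m+1$ independent eigenfunctions, namely

$$ v_{0,m,n}(r,\varphi,\theta), \quad v_{1,m,n}(r,\varphi,\theta), \ \ldots, \ v_{m,m,n}(r,\varphi,\theta), \quad \tilde v_{1,m,n}(r,\varphi,\theta), \ \ldots, \ \tilde v_{m,m,n}(r,\varphi,\theta). $$

Consequently, the fundamental vibrational frequencies of a solid ball

$$ \omega_{m,n} = c \sqrt{\lambda_{m,n}} = c\, \sigma_{m,n}, \qquad m = 0,1,2,\ldots, \quad n = 1,2,3,\ldots, \tag{12.129} $$

are equal to the spherical Bessel roots $\sigma_{m,n}$ multiplied by the wave speed. There is a
total of $2(2m+1)$ independent vibrational modes associated with each distinct frequency
(12.129), namely

$$ \begin{aligned} u_{0,m,n}(t,r,\varphi,\theta) &= \cos(c\,\sigma_{m,n} t)\, S_m(\sigma_{m,n} r)\, Y_m^0(\varphi,\theta), \\ \hat u_{0,m,n}(t,r,\varphi,\theta) &= \sin(c\,\sigma_{m,n} t)\, S_m(\sigma_{m,n} r)\, Y_m^0(\varphi,\theta), \\ u_{k,m,n}(t,r,\varphi,\theta) &= \cos(c\,\sigma_{m,n} t)\, S_m(\sigma_{m,n} r)\, Y_m^k(\varphi,\theta), \\ \hat u_{k,m,n}(t,r,\varphi,\theta) &= \sin(c\,\sigma_{m,n} t)\, S_m(\sigma_{m,n} r)\, Y_m^k(\varphi,\theta), \\ \tilde u_{k,m,n}(t,r,\varphi,\theta) &= \cos(c\,\sigma_{m,n} t)\, S_m(\sigma_{m,n} r)\, \tilde Y_m^k(\varphi,\theta), \\ \hat{\tilde u}_{k,m,n}(t,r,\varphi,\theta) &= \sin(c\,\sigma_{m,n} t)\, S_m(\sigma_{m,n} r)\, \tilde Y_m^k(\varphi,\theta), \end{aligned} \qquad \begin{aligned} n &= 1,2,3,\ldots, \\ m &= 0,1,2,\ldots, \\ k &= 1,2,\ldots,m. \end{aligned} \tag{12.130} $$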

In particular, the radially symmetric modes of vibration have, according to (12.105), the
elementary form

$$ \begin{aligned} u_{0,0,n}(t,r) &= \cos(c\,n\pi t)\, S_0(n\pi r) = \frac{\cos(c\,n\pi t)\, \sin(n\pi r)}{n\pi r}, \\ \hat u_{0,0,n}(t,r) &= \sin(c\,n\pi t)\, S_0(n\pi r) = \frac{\sin(c\,n\pi t)\, \sin(n\pi r)}{n\pi r}, \end{aligned} \qquad n = 1,2,3,\ldots. \tag{12.131} $$

Their vibrational frequencies, $\omega_{0,n} = c\,n\pi$, are integral multiples of the lowest frequency
$\omega_{0,1} = c\,\pi$. Thus, intriguingly, if you excite only the radially symmetric modes, the resulting
motion of the ball is periodic. However, more general vibrations are only quasiperiodic.

Adopting the same scaling argument as in (11.166), we conclude that the fundamental
frequencies for a solid ball of radius $R$ and wave speed $c$ are given by $\omega_{m,n} = c\,\sigma_{m,n}/R$.
The relative vibrational frequencies

$$ \frac{\omega_{m,n}}{\omega_{0,1}} = \frac{\sigma_{m,n}}{\sigma_{0,1}} = \frac{\sigma_{m,n}}{\pi} $$

are independent of the size of the ball $R$ or the wave speed $c$. In the accompanying table,
we display all relative vibrational frequencies that are < 4 in magnitude.
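The relative frequencies can be recomputed numerically. The following is a minimal sketch, not the method used to produce the table: it evaluates $S_m$ by the standard upward recurrence and locates its roots by a sign-change scan followed by bisection. The function names, the scan step, and the starting point $m + \frac12$ of the scan (which relies on the classical fact that the first positive zero of the Bessel function $J_\nu$ exceeds $\nu$) are illustrative assumptions.

```python
import math

def spherical_bessel(m, x):
    """Spherical Bessel function S_m(x), x > 0, via upward recurrence in the order."""
    s_prev = math.sin(x) / x                          # S_0(x)
    if m == 0:
        return s_prev
    s_curr = math.sin(x) / x**2 - math.cos(x) / x     # S_1(x)
    for k in range(1, m):
        # S_{k+1}(x) = (2k+1)/x * S_k(x) - S_{k-1}(x)
        s_prev, s_curr = s_curr, (2 * k + 1) / x * s_curr - s_prev
    return s_curr

def spherical_bessel_roots(m, x_max, step=0.01):
    """Positive roots of S_m below x_max: sign-change scan plus bisection."""
    roots = []
    a = m + 0.5          # assumed safe start: no positive root of S_m lies below m + 1/2
    fa = spherical_bessel(m, a)
    while a < x_max:
        b = a + step
        fb = spherical_bessel(m, b)
        if fa * fb < 0:
            lo, hi, flo = a, b, fa
            for _ in range(60):                       # bisection to machine precision
                mid = 0.5 * (lo + hi)
                fmid = spherical_bessel(m, mid)
                if flo * fmid <= 0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            root = 0.5 * (lo + hi)
            if root < x_max:
                roots.append(root)
        a, fa = b, fb
    return roots

# Relative frequencies sigma_{m,n} / pi that are strictly below 4.
for m in range(8):
    relative = [s / math.pi for s in spherical_bessel_roots(m, 4 * math.pi - 1e-6)]
    print(m, [round(v, 4) for v in relative])
```

For $m = 0$ the scan returns the multiples of $\pi$, in agreement with the radially symmetric frequencies $\omega_{0,n} = c\,n\pi$ found above.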


[Table: Relative Spherical Bessel Roots $\sigma_{m,n}/\sigma_{0,1}$, indexed by $m$ and $n$; the entries are not recoverable.]

The purely radial modes of vibration (12.131) have individual frequencies

$$ \omega_{0,n} = \frac{n \pi c}{R}, \qquad \text{so} \qquad \frac{\omega_{0,n}}{\omega_{0,1}} = n, $$
which appear in the first column of the table. The lowest frequency is $\omega_{0,1} = \pi c/R$, corresponding
to a vibration with period $2\pi/\omega_{0,1} = 2R/c$. In particular, for the Earth, the
radius $R \approx 6000$ km, and the wave speed in rock is, on average, $c \approx 5$ km/sec, so that the
fundamental mode of vibration has period $2R/c \approx 2400$ seconds, or 40 minutes. Of course,
we have suppressed almost all interesting terrestrial geology in this very crude approximation,
which has been based on the assumption that the Earth is a uniform spherical body,
globally vibrating only in its radial direction. A more realistic modeling of the vibrations of
the Earth requires an understanding of the basic partial differential equations of linear and
nonlinear elastodynamics, [7, 49]. Nonuniformities in the Earth lead to scattering of the
vibrational waves, which are then used to locate subterranean geological structures, e.g., oil
and gas deposits. Localized vibrations of the Earth are also known as seismic waves, and,
of course, earthquakes are their most severe manifestation. We refer the interested reader
to [5] for an introduction to mathematical seismology. Understanding terrestrial vibrations
is an issue of critical importance in geophysics and civil engineering, including the design
of structures, buildings, and bridges, requiring the avoidance of potentially catastrophic
resonant frequencies.
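The arithmetic behind the Earth estimate is easily reproduced. The short sketch below uses the rounded values quoted above; the function name is an illustrative choice.

```python
import math

def fundamental_period(radius, wave_speed):
    """Period 2*pi/omega_{0,1} of the lowest radial mode, with omega_{0,1} = pi*c/R."""
    omega = math.pi * wave_speed / radius
    return 2 * math.pi / omega      # algebraically equal to 2 R / c

period_seconds = fundamental_period(6000.0, 5.0)   # R in km, c in km/sec
print(period_seconds, "seconds =", period_seconds / 60.0, "minutes")
```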

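Example 12.15. The radial vibrations of a hollow thin spherical shell (e.g., an elastic
balloon) are governed by the differential equation

$$ \frac{\partial^2 u}{\partial t^2} = c^2 \Delta_S u = c^2 \left( \frac{\partial^2 u}{\partial \varphi^2} + \frac{\cos\varphi}{\sin\varphi} \frac{\partial u}{\partial \varphi} + \frac{1}{\sin^2\varphi} \frac{\partial^2 u}{\partial \theta^2} \right), \tag{12.133} $$

where $\Delta_S$ denotes the spherical Laplacian (12.19). The radial displacement $u(t,\varphi,\theta)$ of a
point on the sphere depends only on time $t$ and the angular coordinates $\varphi, \theta$. The solution
$u(t,\varphi,\theta)$ is required to be $2\pi$-periodic in the azimuthal angle $\theta$ and bounded at the poles,
where $\varphi = 0$ and $\pi$.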

According to (12.38), the $n$th eigenvalue of the spherical Laplacian, $\lambda_n = n(n+1)$,
possesses $2n+1$ linearly independent eigenfunctions, namely, the spherical harmonics

$$ Y_n^0(\varphi,\theta), \quad Y_n^1(\varphi,\theta), \ \ldots, \ Y_n^n(\varphi,\theta), \quad \tilde Y_n^1(\varphi,\theta), \ \ldots, \ \tilde Y_n^n(\varphi,\theta). $$

As a consequence, the fundamental frequencies of vibration for a spherical shell are

$$ \omega_n = c \sqrt{\lambda_n} = c \sqrt{n(n+1)}, \qquad n = 1,2,\ldots. \tag{12.134} $$

The vibrational solutions are quasiperiodic combinations of the fundamental spherical harmonic
modes

$$ \cos(\omega_n t)\, Y_n^m(\varphi,\theta), \qquad \sin(\omega_n t)\, Y_n^m(\varphi,\theta), \qquad \cos(\omega_n t)\, \tilde Y_n^m(\varphi,\theta), \qquad \sin(\omega_n t)\, \tilde Y_n^m(\varphi,\theta). $$

Representative graphs can be seen in Figure 12.5. The smallest positive eigenvalue is
$\lambda_1 = 2$, yielding a lowest tone of frequency $\omega_1 = c\sqrt{2}$. The higher-order frequencies are,
with rare exceptions such as $\omega_8 = 6\sqrt{2}\, c = 6\,\omega_1$, irrational multiples of the fundamental
frequency, implying that a vibrating spherical bell sounds dissonant to our ears.
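To see how the overtones of the shell are spaced, one can tabulate the ratios $\omega_n/\omega_1 = \sqrt{n(n+1)/2}$ implied by (12.134). The following sketch is purely illustrative; the function name and the default wave speed $c = 1$ are assumptions.

```python
import math

def shell_frequency(n, c=1.0):
    """Frequency omega_n = c * sqrt(n (n + 1)) of the n-th shell mode, cf. (12.134)."""
    return c * math.sqrt(n * (n + 1))

# Ratios omega_n / omega_1 for the first few modes; they do not depend on c.
ratios = [shell_frequency(n) / shell_frequency(1) for n in range(1, 9)]
for n, ratio in enumerate(ratios, start=1):
    print(n, round(ratio, 6))
```

The ratio is rational only when $n(n+1)/2$ is a perfect square, which first happens again at $n = 8$.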

One further remark is in order. The spherical Laplacian operator is only positive semi-definite,
since the lowest mode has eigenvalue $\lambda_0 = 0$, which corresponds to the constant
null eigenfunction $v_0(\varphi,\theta) = Y_0^0(\varphi,\theta) = 1$. Therefore, the wave equation (12.133) admits
an unstable mode $b_{0,0}\, t$, corresponding to a uniform radial inflation; its coefficient

$$ b_{0,0} = \frac{1}{4\pi} \iint_{S_1} \frac{\partial u}{\partial t}(0,\varphi,\theta) \, dS $$

represents the shell’s average initial velocity. The existence of such an unstable mode is an
artifact of the simplified linear model we are using, which fails to account for nonlinearly
elastic effects that serve to constrain the inflation of a spherical balloon.


Exercises 

12.5.1. Find the eigenfunction series solution to the initial-boundary value problem for the
wave equation $u_{tt} = \Delta u$ on a unit cube $C = \{\, 0 < x, y, z < 1 \,\}$, subject to homogeneous
Dirichlet boundary conditions and one of the following sets of initial conditions:

(a) $u(0,x,y,z) = 1$, $u_t(0,x,y,z) = 0$; (b) $u(0,x,y,z) = 0$, $u_t(0,x,y,z) = 1$;

(c) $u(0,x,y,z) = \sin \pi x \, \sin \pi y \, \sin \pi z$, $u_t(0,x,y,z) = 0$; (d) $u(0,x,y,z) = \sin 3\pi x$,
$u_t(0,x,y,z) = \sin 2\pi y$; (e) $u(0,x,y,z) = 0$, $u_t(0,x,y,z) = x y z\,(1-x)(1-y)(1-z)$.

12.5.2.  Suppose  the  cube  in  Exercise  12.5.1  is  subject  to  homogeneous  Neumann  boundary 
conditions.  Which  of  the  preceding  initial  value  problems  leads  to  an  unstable  motion  of 
the  cube? 

12.5.3.  (a)  Find  the  separable  periodic  vibrations  of  a  unit  cube  subject  to  homogeneous 
Dirichlet  boundary  conditions,  (b)  Can  you  find  a  periodic  mode  that  is  not  separable? 

12.5.4.  Answer  Exercise  12.5.3  when  one  face  of  the  cube  is  left  free,  while  the  other  five  faces 
are  fixed. 

12.5.5. Given a material with wave speed $c = 1.5$ cm/sec, find the natural vibrational frequencies
of a solid rectangular box of size 1 cm × 2 cm × 3 cm whose sides are held fixed. List
the lowest five such frequencies in order. Does the box vibrate periodically?

12.5.6. Find the natural vibrational frequencies of a solid cylinder of height 2, radius 1, and
wave speed $c = 1$, when (a) all sides are fixed; (b) the top and bottom of the cylinder are
free, while the curved side is fixed; (c) the curved side of the cylinder is free, while the top
and bottom are fixed.

12.5.7.  Among  all  solid  cylinders  of  unit  volume  with  fixed  boundary,  find  the  one  that  vibrates 
the  slowest. 

12.5.8. Does a solid spherical ball that is subject to homogeneous Neumann boundary conditions
vibrate (i) faster, (ii) slower, or (iii) at the same rate as the same ball subject to
homogeneous Dirichlet conditions? If your answer is (i) or (ii), estimate how much faster
or slower.

12.5.9.  A  solid  cube  and  solid  sphere  are  made  of  the  same  material  and  have  the  same  volume. 
Which  vibrates  faster  when  subject  to  homogeneous  Dirichlet  boundary  conditions? 

12.5.10.  Assuming  that  they  both  have  the  same  wave  speed  and  fixed  boundaries,  which 
vibrates  faster:  a  solid  sphere  or  a  circular  membrane  of  the  same  radius? 

12.5.11.  A  uniform,  solid  spherical  planet  is  floating  freely  in  outer  space.  Find  its  three 
slowest  resonant  frequencies. 

12.5.12. True or false: Suppose we have two uniform solid bodies composed of the same
material. If the first body cools down to thermal equilibrium the fastest, then it also
vibrates the fastest. Explain your answer.

12.5.13. (a) Define what is meant by a nodal curve and a nodal region on a vibrating thin
spherical shell. (b) True or false: All the nodal curves are arcs of circles.


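12.5.14. The propagation of electromagnetic waves (including light) is governed by the electric
field $\mathbf{E}(t,\mathbf{x})$ and magnetic field $\mathbf{B}(t,\mathbf{x})$, which are both time-dependent vector fields defined
for $\mathbf{x} = (x,y,z)$ in a domain $D \subset \mathbb{R}^3$. In empty space, Maxwell’s equations (as formulated
by Heaviside) are

$$ \frac{\partial \mathbf{B}}{\partial t} = -\,\nabla \times \mathbf{E}, \qquad \frac{\partial \mathbf{E}}{\partial t} = \frac{1}{\mu_0\, \epsilon_0}\, \nabla \times \mathbf{B}, \qquad \nabla \cdot \mathbf{E} = 0, \qquad \nabla \cdot \mathbf{B} = 0, \tag{12.136} $$

where $\mu_0, \epsilon_0$ are, respectively, the permeability and permittivity constants. Prove that all
individual components of $\mathbf{E}$ and $\mathbf{B}$ satisfy the scalar wave equation. What is the wave speed,
i.e., the speed of light in empty space?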


12.6  Spherical  Waves  and  Huygens’  Principle 

For  any  dynamical  partial  differential  equation,  the  fundamental  solution  measures  the 
effect  of  applying  an  instantaneous  concentrated  unit  impulse  at  a  single  point.  Two 
representative  physical  effects  to  keep  in  mind  are  the  light  waves  emanating  from  a  sudden 
concentrated  blast,  e.g.,  a  lightning  bolt  or  a  stellar  supernova,  and  the  sound  waves 
due  to  an  explosion  or  thunderclap,  propagating  in  air  at  a  much  slower  speed.  Linear 
superposition  utilizes  the  fundamental  solution  to  build  up  more  general  solutions  to  initial 
value  problems.  For  the  wave  and  other  second-order  vibrational  equations,  the  impulse 
can  be  applied  either  to  the  initial  displacement  or  to  the  initial  velocity,  resulting  in  two 
distinct  types  of  fundamental  solution.  The  general  solution  to  the  initial  value  problem  will 
be  obtained  by  a  double  superposition.  In  this  section,  we  derive  explicit  formulas  for  the 
two  fundamental  solutions  for  the  three-dimensional  wave  equation  on  all  of  space,  leading 
to Kirchhoff’s formula for the solution to the general initial value problem. An important
consequence  is  Huygens’  Principle,  which  states  that,  in  three-dimensional  space,  localized 
initial  disturbances  remain  localized  as  they  propagate.  In  the  final  subsection,  we  apply 
the  method  of  descent  to  our  three-dimensional  solution  formulas  in  order  to  solve  the 
two-dimensional  wave  equation,  for  which,  surprisingly,  Huygens’  Principle  is  no  longer 
valid. 


Spherical  Waves 


In a uniform isotropic medium, an initial concentrated blast results in a spherically expanding
wave, moving away at the speed of light (or sound) in all directions. Invoking
translation invariance, we will assume, without loss of generality, that the source of the
disturbance is at the origin, and so the solution $u(t,\mathbf{x})$ should depend only on the distance
$r = \| \mathbf{x} \|$ from the source. We adopt spherical coordinates and look for a solution $u = u(t,r)$
to the three-dimensional wave equation (12.123) with no angular dependence. Substituting
the formula (12.16) for the spherical Laplacian and setting both angular derivatives to 0,
we are led to the partial differential equation

$$ \frac{\partial^2 u}{\partial t^2} = c^2 \left( \frac{\partial^2 u}{\partial r^2} + \frac{2}{r} \frac{\partial u}{\partial r} \right), \tag{12.137} $$

which governs the propagation of spherically symmetric waves in three-dimensional space.
Surprisingly, we can explicitly solve (12.137). The secret is to multiply both sides of the
equation by $r$:

$$ r\, \frac{\partial^2 u}{\partial t^2} = c^2 \left( r\, \frac{\partial^2 u}{\partial r^2} + 2\, \frac{\partial u}{\partial r} \right) = c^2\, \frac{\partial^2}{\partial r^2} (r\, u). $$

Thus, the function

$$ w(t,r) = r\, u(t,r) $$

satisfies the one-dimensional wave equation

$$ \frac{\partial^2 w}{\partial t^2} = c^2\, \frac{\partial^2 w}{\partial r^2}. \tag{12.138} $$


According to Theorem 2.14, the general solution to the one-dimensional wave equation
(12.138) can be written in d’Alembert form

$$ w(t,r) = p(r - ct) + q(r + ct), $$

where $p(\xi)$ and $q(\eta)$ are arbitrary functions of a single characteristic variable. Therefore,
spherically symmetric solutions to the three-dimensional wave equation assume the form

$$ u(t,r) = \frac{p(r - ct)}{r} + \frac{q(r + ct)}{r}. \tag{12.139} $$
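As a sanity check, one can verify numerically that $p(r - ct)/r$ satisfies the spherically symmetric wave equation (12.137). The sketch below uses centered finite differences; the Gaussian profile, the wave speed $c = 2$, and the step size are arbitrary illustrative choices, and any sufficiently smooth profile should behave similarly.

```python
import math

C = 2.0  # wave speed (arbitrary illustrative value)

def profile(xi):
    """Smooth illustrative profile p(xi); a Gaussian is assumed here."""
    return math.exp(-xi * xi)

def u(t, r):
    """Outgoing spherical wave p(r - c t) / r."""
    return profile(r - C * t) / r

def residual(t, r, h=1e-3):
    """Finite-difference value of u_tt - c^2 (u_rr + (2/r) u_r); should be near zero."""
    u_tt = (u(t + h, r) - 2 * u(t, r) + u(t - h, r)) / h**2
    u_rr = (u(t, r + h) - 2 * u(t, r) + u(t, r - h)) / h**2
    u_r = (u(t, r + h) - u(t, r - h)) / (2 * h)
    return u_tt - C**2 * (u_rr + 2.0 / r * u_r)

print(residual(0.3, 1.5), residual(1.0, 3.0))
```

The residual is not exactly zero, since the centered differences carry a truncation error of order $h^2$, but it is small compared with the individual terms $u_{tt}$ and $c^2 u_{rr}$.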

The first summand,

$$ u(t,r) = \frac{p(r - ct)}{r}, \tag{12.140} $$

represents a wave moving at speed $c$ in the direction of increasing $r$, and so describes the
illumination from a variable light source that is concentrated at the origin, e.g., a pulsating
quasar in interstellar space. To highlight this interpretation, let us concentrate on the case
that $p(\xi) = \delta(\xi - a)$ is a delta function, keeping in mind that more general solutions can
then be assembled by linear superposition. The induced solution

$$ u(t,r) = \frac{\delta(r - ct - a)}{r} = \frac{\delta\bigl(r - c\,(t - t_0)\bigr)}{r}, \qquad \text{where} \qquad t_0 = -\,\frac{a}{c}, \tag{12.141} $$

represents a spherical wave propagating through space. At the instant $t = t_0$, the light is
entirely concentrated at the origin, $r = 0$. The signal then moves away from the origin
in all directions at speed $c$. At each later time $t > t_0$, the wave remains concentrated on
the surface of a sphere of radius $r = c\,(t - t_0)$. Its intensity at each point on the sphere,
however, has decreased by a factor $1/r$, and so, the farther the light travels away from the
source, the dimmer it becomes. A stationary observer sitting at a fixed point in space will
see only an instantaneous flash of light of intensity $1/r$ as the spherical wave passes by
at time $t = t_0 + r/c$, where $r$ is the observer’s distance from the light source. A similar
statement holds for sound waves: to an observer, the sound of a distant explosion will
last momentarily. Thunder and lightning are the most familiar examples of this everyday
phenomenon.

On the other hand, for $t < t_0$, the impulse is concentrated at a negative radius
$r = c\,(t - t_0) < 0$. To interpret this, note that, for spherical coordinates (12.15), replacing $r$
by $-r$ has the same effect as changing $\mathbf{x}$ to the antipodal point $-\mathbf{x}$. Thus, the solution
(12.141) represents a concentrated spherically symmetric light wave arriving from the edges
of the universe at speed $c$ that strengthens in intensity as it collapses into the origin at
$t = t_0$. After collapse, it immediately reappears and expands back out into the universe.

The second solution in the d’Alembert formula (12.139) has, in fact, exactly the same
physical form under the antipodal identification. Indeed, if we set

$$ \hat r = -\,r, \qquad \hat p(\xi) = -\,q(-\xi), $$

then

$$ \frac{q(r + ct)}{r} = \frac{\hat p(\hat r - ct)}{\hat r}. $$

Thus, the second d’Alembert solution is redundant, and we only need to consider solutions
of the form (12.140) from now on.

To effectively utilize such spherical wave solutions, we need to understand the nature
of their originating singularity. For simplicity, we set $t_0 = 0$ in (12.141) and concentrate
on the particular solution

$$ u(t,r) = \frac{\delta(r - ct)}{r}, \tag{12.142} $$

which apparently has a bad singularity at the origin, $r = 0$, at the initial time $t = 0$. We
need to pin down precisely which sort of distribution (generalized function) it represents.
Invoking the limiting definition is tricky, and it will be easier to work with the dual characterization
of a distribution as a linear functional. Thus, at a fixed time $t > 0$, we must
evaluate the inner product

$$ \langle\, u(t,\cdot\,), f \,\rangle = \iiint u(t,x,y,z)\, f(x,y,z) \, dx\, dy\, dz $$

of the solution with a smooth test function $f(\mathbf{x}) = f(x,y,z)$. We rewrite the triple integral
in spherical coordinates, whereby

$$ \langle\, u(t,\cdot\,), f \,\rangle = \int_{-\pi}^{\pi} \int_0^{\pi} \int_0^{\infty} \frac{\delta(r - ct)}{r}\, f(r,\varphi,\theta)\, r^2 \sin\varphi \, dr\, d\varphi\, d\theta. $$

When $t > 0$, the $r$ integration can be immediately evaluated, and so

$$ \langle\, u(t,\cdot\,), f \,\rangle = c\,t \int_{-\pi}^{\pi} \int_0^{\pi} f(ct,\varphi,\theta) \sin\varphi \, d\varphi\, d\theta = 4\pi c\, t\, M_{ct}[f], \tag{12.143} $$


where

$$ M_{ct}[f] = \frac{1}{4\pi} \int_{-\pi}^{\pi} \int_0^{\pi} f(ct,\varphi,\theta) \sin\varphi \, d\varphi\, d\theta = \frac{1}{4\pi c^2 t^2} \iint_{S_{ct}} f \, dS \tag{12.144} $$

is the mean or average value of the function $f$ on the sphere $S_{ct} = \{\, \| \mathbf{x} \| = ct \,\}$ of radius
$r = ct$ and, hence, surface area $4\pi c^2 t^2$. In particular, in the limit as the sphere’s radius
$ct \to 0$, by continuity, the mean reduces to just the value of the function at the origin:

$$ \lim_{t \to 0} M_{ct}[f] = M_0[f] = f(\mathbf{0}). \tag{12.145} $$

Thus, (12.143) implies that

$$ \lim_{t \to 0} \langle\, u(t,\cdot\,), f \,\rangle = \langle\, u(0,\cdot\,), f \,\rangle = 0 $$

for all functions $f$, and hence $u(0,x,y,z) = 0$ represents a zero initial displacement. In other
words, there is, in fact, no singularity in the solution at $t = 0$! (Here, for fixed $t$, we use
$u(t,\cdot\,)$ to indicate the real-valued function $(x,y,z) \mapsto u(t,x,y,z)$ on $\mathbb{R}^3$.)

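In the absence of any initial displacement, how, then, can the solution (12.142) be
nonzero? Clearly, this must be the result of a nonzero initial velocity. To evaluate $\partial u/\partial t$,
we differentiate (12.143), whereby

$$ \begin{aligned} \Bigl\langle\, \frac{\partial u}{\partial t}, f \,\Bigr\rangle = \frac{\partial}{\partial t} \langle\, u, f \,\rangle &= \frac{\partial}{\partial t} \left( c\,t \int_{-\pi}^{\pi} \int_0^{\pi} f(ct,\varphi,\theta) \sin\varphi \, d\varphi\, d\theta \right) \\ &= c \int_{-\pi}^{\pi} \int_0^{\pi} f(ct,\varphi,\theta) \sin\varphi \, d\varphi\, d\theta + c^2 t \int_{-\pi}^{\pi} \int_0^{\pi} \frac{\partial f}{\partial r}(ct,\varphi,\theta) \sin\varphi \, d\varphi\, d\theta \\ &= 4\pi c\, M_{ct}[f] + 4\pi c^2 t\, M_{ct}\!\left[ \frac{\partial f}{\partial r} \right]. \end{aligned} \tag{12.146} $$

The result is a linear combination of the means of $f$ and its radial derivative $f_r$ over the
sphere of radius $ct$. In the limit as $t \to 0$, the second term goes to 0, and so, by (12.145),

$$ \lim_{t \to 0} \langle\, u_t, f \,\rangle = 4\pi c\, M_0[f] = 4\pi c\, f(\mathbf{0}). $$

Since this holds for all test functions $f$, we conclude that the initial velocity of our solution
is a multiple of a delta function at the origin:

$$ u_t(0,r) = 4\pi c\, \delta(\mathbf{x}). $$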

Dividing through by $4\pi c$, we find that the spherical expanding wave

$$ u(t,r) = \frac{\delta(r - ct)}{4\pi c\, r} \tag{12.147} $$

solves the initial value problem

$$ u(0,\mathbf{x}) = 0, \qquad \frac{\partial u}{\partial t}(0,\mathbf{x}) = \delta(\mathbf{x}), $$

corresponding to an initial unit-velocity impulse concentrated at the origin. This solution
can be viewed as the three-dimensional version of the hammer-blow solution to the one-dimensional
wave equation discussed in Exercise 6.3.28.

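More generally, we use the translational symmetry of the wave equation to conclude
that the function

$$ G(t,\mathbf{x};\boldsymbol{\xi}) = \frac{\delta\bigl( \| \mathbf{x} - \boldsymbol{\xi} \| - ct \bigr)}{4\pi c\, \| \mathbf{x} - \boldsymbol{\xi} \|}, \qquad t > 0, \tag{12.148} $$

is the fundamental solution to the wave equation resulting from a unit-velocity impulse
concentrated at the point $\boldsymbol{\xi}$ at the initial time $t = 0$:

$$ G(0,\mathbf{x};\boldsymbol{\xi}) = 0, \qquad \frac{\partial G}{\partial t}(0,\mathbf{x};\boldsymbol{\xi}) = \delta(\mathbf{x} - \boldsymbol{\xi}). \tag{12.149} $$

With this in hand, we can apply linear superposition to solve the zero initial displacement
initial value problem

$$ u(0,x,y,z) = 0, \qquad \frac{\partial u}{\partial t}(0,x,y,z) = g(x,y,z). \tag{12.150} $$

Figure 12.9. Cross-section of a sphere intersecting a ball.

Namely, we write the initial velocity

$$ g(\mathbf{x}) = \iiint g(\boldsymbol{\xi})\, \delta(\mathbf{x} - \boldsymbol{\xi}) \, d\xi\, d\eta\, d\zeta $$

as a superposition of impulses, and immediately conclude that the relevant solution is the
selfsame superposition of spherical waves:

$$ u(t,\mathbf{x}) = \iiint g(\boldsymbol{\xi})\, \frac{\delta\bigl( \| \mathbf{x} - \boldsymbol{\xi} \| - ct \bigr)}{4\pi c\, \| \mathbf{x} - \boldsymbol{\xi} \|} \, d\xi\, d\eta\, d\zeta = \frac{1}{4\pi c^2 t} \iint_{\| \boldsymbol{\xi} - \mathbf{x} \| = ct} g(\boldsymbol{\xi}) \, dS = t\, M_{ct}^{\mathbf{x}}[g]. \tag{12.151} $$

Thus, the value of our solution at position $\mathbf{x}$ and time $t > 0$ is equal to $t$ times the mean
of the initial velocity function $g$ over the sphere of radius $r = ct$ centered at the point $\mathbf{x}$.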
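The formula $u = t\, M_{ct}^{\mathbf{x}}[g]$ lends itself to direct numerical evaluation. The following is a rough sketch, assuming a simple midpoint rule in the two spherical angles; the function names and grid sizes are illustrative choices and not part of the text.

```python
import math

def spherical_mean(g, x, radius, n_phi=100, n_theta=200):
    """Midpoint-rule approximation of the mean of g over the sphere |xi - x| = radius."""
    d_phi = math.pi / n_phi
    d_theta = 2 * math.pi / n_theta
    total = 0.0
    for i in range(n_phi):
        phi = (i + 0.5) * d_phi                      # zenith angle in (0, pi)
        sin_phi, cos_phi = math.sin(phi), math.cos(phi)
        for j in range(n_theta):
            theta = (j + 0.5) * d_theta              # azimuthal angle
            xi = (x[0] + radius * sin_phi * math.cos(theta),
                  x[1] + radius * sin_phi * math.sin(theta),
                  x[2] + radius * cos_phi)
            total += g(xi) * sin_phi                 # sin(phi) is the area weight
    return total * d_phi * d_theta / (4 * math.pi)

def velocity_solution(g, x, t, c=1.0):
    """u(t, x) = t * (mean of g over the sphere of radius c t centered at x)."""
    return t * spherical_mean(g, x, c * t)

# Illustrative check: g(x, y, z) = x^2 should give u = t x^2 + c^2 t^3 / 3.
print(velocity_solution(lambda xi: xi[0] ** 2, (0.5, 0.0, 0.0), 1.2))
```

For the quadratic initial velocity $g = x^2$, the function $u = t\,x^2 + \frac13 c^2 t^3$ can be checked by hand to solve the wave equation with $u(0,\mathbf{x}) = 0$, $u_t(0,\mathbf{x}) = x^2$, which provides a convenient test of the quadrature.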


Example 12.16. Let us set the wave speed $c = 1$. Suppose that the initial velocity

$$ g(\mathbf{x}) = \begin{cases} 1, & \| \mathbf{x} \| < 1, \\ 0, & \| \mathbf{x} \| > 1, \end{cases} $$

is 1 inside the unit ball $B_1$ centered at the origin and 0 outside. To solve the corresponding
initial velocity problem, we must compute the average value of $g$ over a sphere

$$ S_t^{\mathbf{x}} = \{\, \boldsymbol{\xi} \;|\; \| \boldsymbol{\xi} - \mathbf{x} \| = t \,\} $$

of radius $t > 0$ centered at a point $\mathbf{x} \in \mathbb{R}^3$. Since $g = 0$ outside the unit ball, its average
will be equal to the surface area of that part of the sphere that is contained inside the unit
ball, namely $S_t^{\mathbf{x}} \cap B_1$, divided by the total surface area of $S_t^{\mathbf{x}}$, namely $4\pi t^2$.

To compute this quantity, let $r = \| \mathbf{x} \|$. If $t > r + 1$ or $0 < t < r - 1$, then the sphere
of radius $t$ lies entirely outside the unit ball, and so the average is 0; if $0 < t < 1 - r$, which
requires $r < 1$ and so $\mathbf{x} \in B_1$, then the sphere lies entirely within the unit ball, and so the
average is 1. Otherwise, referring to Figure 12.9 and Exercise 12.6.7, we see that the area
of the spherical cap $S_t^{\mathbf{x}} \cap B_1$ is given by

$$ 2\pi t^2 (1 - \cos\alpha) = 2\pi t^2 \left( 1 - \frac{r^2 + t^2 - 1}{2\,r\,t} \right) = \frac{\pi t}{r} \bigl[\, 1 - (t - r)^2 \,\bigr], \tag{12.152} $$

where $\alpha$ denotes the angle between the line joining the centers of the two spheres and
the circle formed by their intersection, whose value is prescribed by the Law of Cosines.
Assembling the different subcases, we conclude that

$$ M_{ct}^{\mathbf{x}}[g] = \begin{cases} 1, & 0 \le t \le 1 - r, \\[4pt] \dfrac{1 - (t - r)^2}{4\,r\,t}, & |\, r - 1 \,| \le t \le r + 1, \\[4pt] 0, & 0 \le t \le r - 1 \ \text{ or } \ t \ge r + 1. \end{cases} \tag{12.153} $$

The solution (12.151) is obtained by multiplying by $t$, and hence for $t \ge 0$,

$$ u(t,\mathbf{x}) = \begin{cases} t, & 0 \le t \le 1 - \| \mathbf{x} \|, \\[4pt] \dfrac{1 - (t - \| \mathbf{x} \|)^2}{4\, \| \mathbf{x} \|}, & \bigl|\, \| \mathbf{x} \| - 1 \,\bigr| \le t \le \| \mathbf{x} \| + 1, \\[4pt] 0, & 0 \le t \le \| \mathbf{x} \| - 1 \ \text{ or } \ t \ge \| \mathbf{x} \| + 1. \end{cases} \tag{12.154} $$

The resulting function is not smooth at the interfaces $t = \bigl|\, \| \mathbf{x} \| - 1 \,\bigr|$ and $t = \| \mathbf{x} \| + 1$, and
hence does not qualify as a classical solution. Nevertheless, it can be shown that (12.154)
is a bona fide weak solution to the initial value problem.
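The closed-form solution of Example 12.16 can be cross-checked against a direct computation of the fraction of the sphere lying inside the unit ball. The sketch below assumes $c = 1$ and uses the fact that, measuring the zenith angle $\varphi$ from the direction of $\mathbf{x}$, a point on the sphere lies at squared distance $r^2 + t^2 + 2\,r\,t\cos\varphi$ from the origin. The function names and the quadrature resolution are illustrative.

```python
import math

def u_velocity(t, r):
    """Closed-form solution (12.154) with c = 1, for t >= 0 and r = |x| >= 0."""
    if t <= 1 - r:
        return t
    if abs(r - 1) <= t <= r + 1:
        return (1 - (t - r) ** 2) / (4 * r)
    return 0.0

def cap_fraction(t, r, n=20000):
    """Fraction of the sphere of radius t about x (with |x| = r) inside the unit ball."""
    h = math.pi / n
    inside = 0.0
    for i in range(n):
        phi = (i + 0.5) * h                  # zenith angle measured from the direction of x
        dist_sq = r * r + t * t + 2 * r * t * math.cos(phi)
        if dist_sq < 1.0:
            inside += math.sin(phi)
    return 0.5 * inside * h                  # azimuthal integral contributes 2*pi / (4*pi)

# t times the mean of g should reproduce the closed form.
for t, r in [(1.0, 0.5), (1.5, 2.0), (0.5, 0.2)]:
    print(t, r, u_velocity(t, r), t * cap_fraction(t, r))
```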

The first two rows of Figure 12.10 plot the solution as a function of time for several
fixed values of $r = \| \mathbf{x} \|$. An observer sitting at the origin will see a linearly increasing
light intensity followed by a sudden blackout. At other points inside the sphere, there
is a similar linear increase, while the subsequent decrease follows a parabolic arc; if the
observer is closer to the edge of the ball than its center, the parabolic portion will continue
to increase for a while before eventually tapering off. On the other hand, an observer sitting
outside the sphere will experience, after an initially dark period, a symmetric parabolic
increase to a maximal intensity and then a decrease back to dark after a total time lapse
of 2. The second two rows plot the solution as a function of $r$ for various fixed times.
Note that, up until time $t = 1$, the light spreads out while increasing in intensity near the
origin, after which the solution is of gradually decreasing magnitude, supported within the
domain lying between two concentric spheres of respective radii $t - 1$ and $t + 1$.


Returning to the general situation, we note that the solution formula (12.151) handles
only nonzero initial velocities. What about solutions resulting from a nonzero initial
displacement? Surprisingly, the answer comes from differentiation! The key observation is
that if $u(t,\mathbf{x})$ is any (sufficiently smooth) solution to the wave equation, then so is its time
derivative

$$ v(t,\mathbf{x}) = \frac{\partial u}{\partial t}(t,\mathbf{x}). \tag{12.155} $$

This follows at once from differentiating both sides of the wave equation with respect to $t$
and using the equality of mixed partial derivatives. Physically, this implies that the velocity
of a wave obeys the same evolutionary principle as the wave itself, which is a manifestation
of the linearity and time-independence (autonomy) of the equation.

Now, suppose $u$ has initial conditions

$$ u(0,\mathbf{x}) = f(\mathbf{x}), \qquad u_t(0,\mathbf{x}) = g(\mathbf{x}). \tag{12.156} $$

What are the initial conditions for its derivative $v = u_t$? Clearly, its initial displacement

$$ v(0,\mathbf{x}) = u_t(0,\mathbf{x}) = g(\mathbf{x}) \tag{12.157} $$

Figure 12.10. Wave equation solution $u(t,r)$ due to an initial velocity of the unit ball.

equals the initial velocity of $u$. As for its initial velocity, we have

$$ \frac{\partial v}{\partial t} = \frac{\partial^2 u}{\partial t^2} = c^2 \Delta u, $$

because we are assuming that $u$ solves the wave equation. Thus, at the initial time, the
velocity,

$$ \frac{\partial v}{\partial t}(0,\mathbf{x}) = c^2 \Delta u(0,\mathbf{x}) = c^2 \Delta f(\mathbf{x}), \tag{12.158} $$

equals $c^2$ times the Laplacian of the initial displacement $f$. In particular, if $u$ satisfies the
initial conditions

$$ u(0,\mathbf{x}) = 0, \qquad u_t(0,\mathbf{x}) = g(\mathbf{x}), \tag{12.159} $$

then $v = u_t$ satisfies the initial conditions

$$ v(0,\mathbf{x}) = g(\mathbf{x}), \qquad v_t(0,\mathbf{x}) = 0. \tag{12.160} $$


Thus, paradoxically, to solve the initial displacement problem we differentiate the initial
velocity solution (12.151) with respect to $t$, and hence

$$ v(t,\mathbf{x}) = \frac{\partial u}{\partial t}(t,\mathbf{x}) = \frac{\partial}{\partial t} \Bigl( t\, M_{ct}^{\mathbf{x}}[g] \Bigr) = M_{ct}^{\mathbf{x}}[g] + c\,t\, M_{ct}^{\mathbf{x}}\!\left[ \frac{\partial g}{\partial \mathbf{n}} \right], \tag{12.161} $$

where we have made use of our computation in (12.146). Therefore, $v(t,\mathbf{x})$ is a linear
combination of the mean of the function $g$ and the mean of its normal or radial derivative
$\partial g/\partial \mathbf{n} = \partial g/\partial r$, taken over a sphere of radius $ct$ centered at the point $\mathbf{x}$. In particular, to
obtain the solution corresponding to a concentrated initial displacement,

$$ F(0,\mathbf{x};\boldsymbol{\xi}) = \delta(\mathbf{x} - \boldsymbol{\xi}), \qquad \frac{\partial F}{\partial t}(0,\mathbf{x};\boldsymbol{\xi}) = 0, \tag{12.162} $$

we differentiate the solution (12.148), resulting in

$$ F(t,\mathbf{x};\boldsymbol{\xi}) = \frac{\partial G}{\partial t}(t,\mathbf{x};\boldsymbol{\xi}) = -\, \frac{\delta'\bigl( \| \mathbf{x} - \boldsymbol{\xi} \| - ct \bigr)}{4\pi\, \| \mathbf{x} - \boldsymbol{\xi} \|}, \tag{12.163} $$

which is the fundamental solution for the initial displacement problem. Thus, interestingly,
a concentrated initial displacement spawns a spherically expanding doublet, cf. Figure 6.6,
whereas a concentrated initial velocity spawns an expanding spherical singlet or delta wave.


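Example 12.17. Let $c = 1$. Consider the initial conditions

$$u(0, \mathbf{x}) = f(\mathbf{x}) = \begin{cases} 1, & \|\mathbf{x}\| < 1, \\ 0, & \|\mathbf{x}\| > 1, \end{cases} \qquad \frac{\partial u}{\partial t}(0, \mathbf{x}) = 0, \tag{12.164}$$

modeling the effect of an instantaneously illuminated solid ball. To obtain the resulting
solution, we differentiate (12.154) with respect to $t$, leading to

$$u(t, \mathbf{x}) = \begin{cases} 1, & 0 \le t < 1 - \|\mathbf{x}\|, \\[4pt] \dfrac{\|\mathbf{x}\| - t}{2\,\|\mathbf{x}\|}, & \bigl|\, \|\mathbf{x}\| - 1 \,\bigr| < t < \|\mathbf{x}\| + 1, \\[8pt] 0, & 0 \le t < \|\mathbf{x}\| - 1 \ \text{ or } \ t > 1 + \|\mathbf{x}\|. \end{cases} \tag{12.165}$$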


As illustrated in the first two rows of Figure 12.11, an observer sitting at the center of
the ball will see a constant light intensity until $t = 1$, at which time the solution suddenly
goes dark. At other points inside the ball, $0 < r < 1$, the downwards jump in intensity
arrives sooner, and even goes below 0, followed by a further linear decrease, and finally
a jump back to quiescence. An observer placed outside the ball, at radius $r = \|\mathbf{x}\| > 1$,
will experience, after an initially dark period, a sudden increase in the light intensity at
time $t = r - 1$, followed by a linear decrease to negative values, followed by a jump back up to
darkness at time $t = r + 1$. The farther away from the source, the fainter the light. In
the second two rows, we plot the same solution as a function of $r$ for different values of
$t$. Note the sudden appearance of a $1/r$ singularity at the origin at time $t = 1$, which is
the result of a focusing of the initial discontinuities of $u(0, \mathbf{x}) = f(\mathbf{x})$ on the surface of the
unit sphere. Afterwards, the residual radially symmetric disturbance moves off to $\infty$ while
gradually decreasing in intensity. Again, the discontinuities imply that (12.165) is not a
classical solution, but it does qualify as a weak solution to the initial value problem.
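To make the observer's experience concrete, the piecewise formula (12.165) can be evaluated directly. The sketch below assumes $c = 1$ as in the example; the function name `ball_displacement_solution` is hypothetical, and the value returned exactly at the jump times is an arbitrary choice, since the weak solution is discontinuous there.

```python
def ball_displacement_solution(t, r):
    """Evaluate (12.165): c = 1, initial displacement 1 inside the unit ball, 0 outside.

    t >= 0 is time and r = ||x|| >= 0 is the distance from the center of the ball.
    """
    if t < 0 or r < 0:
        raise ValueError("t and r must be nonnegative")
    if t < 1 - r:                  # sphere of radius t about x lies inside the ball
        return 1.0
    if abs(r - 1) < t < r + 1:     # sphere of radius t about x crosses the unit sphere
        return (r - t) / (2 * r)
    return 0.0                     # sphere of radius t about x misses the ball's surface


# Observer outside the ball at r = 2: dark, a jump up at t = r - 1 = 1,
# a linear decrease through zero, then a jump back to darkness at t = r + 1 = 3.
for t in (0.5, 1.5, 2.0, 2.5, 3.5):
    print(t, ball_displacement_solution(t, 2.0))
```

The case $r = 0$ never reaches the division, because the middle condition cannot hold there, so the observer at the center simply sees 1 until $t = 1$ and 0 afterwards.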


Kirchhoff’s Formula and Huygens’ Principle

Linearly combining the two solution formulas (12.151) and (12.161) establishes Kirchhoff’s
formula (first discovered by Poisson), which is the three-dimensional counterpart to
d’Alembert’s solution formula for the wave equation.

Figure 12.11. Wave equation solution $u(t, r)$ due to an initial displacement of the unit ball.

Theorem 12.18. The solution to the initial value problem

$$u_{tt} = c^2 \Delta u, \qquad u(0, \mathbf{x}) = f(\mathbf{x}), \qquad \frac{\partial u}{\partial t}(0, \mathbf{x}) = g(\mathbf{x}), \qquad \mathbf{x} \in \mathbb{R}^3, \tag{12.166}$$

for the wave equation in three-dimensional space is given by

$$u(t, \mathbf{x}) = \frac{\partial}{\partial t}\Bigl( t\, M^{\mathbf{x}}_{ct}[\,f\,] \Bigr) + t\, M^{\mathbf{x}}_{ct}[\,g\,] = M^{\mathbf{x}}_{ct}[\,f\,] + c\,t\, M^{\mathbf{x}}_{ct}\!\left[ \frac{\partial f}{\partial n} \right] + t\, M^{\mathbf{x}}_{ct}[\,g\,], \tag{12.167}$$

where $M^{\mathbf{x}}_{ct}[\,f\,]$ denotes the average of the function $f$ over the sphere $S^{\mathbf{x}}_{ct} = \{\, \|\boldsymbol{\xi} - \mathbf{x}\| = ct \,\}$
of radius $ct$ centered at the point $\mathbf{x}$.
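Kirchhoff's formula lends itself to a direct numerical check. The sketch below is one possible approach, not the text's own method: it approximates the spherical mean by a midpoint rule in $\cos\varphi$ and $\theta$, and replaces the time derivative in (12.167) by a central difference. The function names, the quadrature sizes, and the step `dt` are all illustrative assumptions.

```python
import math


def spherical_mean(func, center, radius, n_polar=200, n_azimuth=64):
    """Approximate the average of func(x, y, z) over the sphere of the given radius
    about center, using the midpoint rule in cos(phi) and in theta."""
    x0, y0, z0 = center
    total = 0.0
    for i in range(n_polar):
        mu = -1.0 + (2.0 * i + 1.0) / n_polar      # midpoint value of cos(phi)
        s = math.sqrt(1.0 - mu * mu)               # corresponding sin(phi)
        for j in range(n_azimuth):
            theta = 2.0 * math.pi * (j + 0.5) / n_azimuth
            total += func(x0 + radius * s * math.cos(theta),
                          y0 + radius * s * math.sin(theta),
                          z0 + radius * mu)
    return total / (n_polar * n_azimuth)


def kirchhoff(f, g, t, x, c=1.0, dt=1e-3):
    """Approximate (12.167): u = d/dt( t M_ct[f] ) + t M_ct[g].
    The time derivative is replaced by a central difference with step dt."""
    def t_times_mean(s, h):
        return s * spherical_mean(h, x, c * s)

    displacement_part = (t_times_mean(t + dt, f) - t_times_mean(t - dt, f)) / (2.0 * dt)
    return displacement_part + t_times_mean(t, g)


# Test data f = x^2, g = y, for which u = x^2 + c^2 t^2 + t y solves the problem exactly.
point, c, t = (0.3, -0.2, 0.5), 2.0, 0.7
approx = kirchhoff(lambda x, y, z: x * x, lambda x, y, z: y, t, point, c=c)
exact = point[0] ** 2 + (c * t) ** 2 + t * point[1]
print(approx, exact)
```

Because the area element on a sphere is uniform in $\cos\varphi$ and $\theta$, an unweighted average over the grid approximates the mean. The two printed values should agree to several digits, with the remaining discrepancy coming from the quadrature and the finite difference.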


A crucially important consequence of the Kirchhoff solution formula is a celebrated
physical principle first set out by the pioneering seventeenth century Dutch scientist
Christiaan Huygens.† Roughly, Huygens’ Principle states that, in three-dimensional space,
localized solutions to the wave equation remain localized. More concretely, (12.167)
implies that the value of the solution at a point $\mathbf{x}$ and time $t$ depends only on the values of
the initial displacements and velocities at a distance $ct$ away. Thus, all signals propagate
along the relativistic light cone

$$c^2 t^2 = x^2 + y^2 + z^2$$

in  four-dimensional  Minkowski  space-time.  Physically,  Huygens’  Principle  assures  us  that 
any  light  that  we  witness  at  time  t  arrived  from  points  that  lie  a  distance  exactly  d  = 
c(t  —  t0)  away  at  an  earlier  time  t0  <  t.  In  particular,  a  localized  initial  signal,  whether 
initial  displacement  or  initial  velocity,  that  is  concentrated  near  a  point  produces  a  response 
that  remains  concentrated  on  an  ever  expanding  sphere  surrounding  the  point.  In  our  three- 
dimensional  universe,  we  witness  the  light  from  a  sudden  explosion  or  lightning  bolt  for  only 
a  brief  moment,  after  which  the  view  returns  to  darkness.  Similarly,  a  sharp  sound,  e.g., 
a  thunderclap,  remains  sharply  concentrated  with  diminishing  magnitude  as  it  propagates 
through  space.  Huygens’  Principle  is  responsible  for  the  important  astronomical  fact  that 
the  light  we  now  observe  from  a  distant  star  was  generated  at  a  single  past  time  that  is 
directly  proportional  to  the  star’s  distance  from  the  Earth.  Remarkably,  as  we  will  show  in 
the  following  subsection,  Huygens’  Principle  does  not  hold  in  a  two-dimensional  universe! 
There,  initially  concentrated  light  and  sound  impulses  will  spread  out  as  time  progresses, 
and  their  effect  will  be  experienced  over  an  extended  time  range;  see  below  for  details. 


Exercises 

12.6.1. Solve the wave equation in three-dimensional space for the following initial conditions:
(a) $u(0, x, y, z) = x + 2$, $u_t(0, x, y, z) = 0$; (b) $u(0, x, y, z) = 0$, $u_t(0, x, y, z) = y$;
(c) $u(0, x, y, z) = 1/(1 + x^2 + y^2 + z^2)$, $u_t(0, x, y, z) = 0$;
(d) $u(0, x, y, z) = 0$, $u_t(0, x, y, z) = 1/(1 + x^2 + y^2 + z^2)$.

12.6.2.  At  what  points  in  space-time  does  a  three-dimensional  wave  vanish  if  it  vanishes 
outside  a  sphere  of  radius  R  at  the  initial  time  t  =  0? 


† Don’t even bother trying to pronounce his name correctly unless you are Dutch!

12.6.3. Consider the initial value problem

$$\frac{\partial^2 u}{\partial t^2} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2}, \qquad u(0, x, y, z) = 0, \qquad \frac{\partial u}{\partial t}(0, x, y, z) = \begin{cases} 1, & 0 < x, y, z < 1, \\ 0, & \text{otherwise}, \end{cases}$$

i.e., the initial velocity is 1 inside a unit cube and 0 outside the cube. We interpret the
solution $u(t, x, y, z)$ as the intensity of light at a given point in space-time, measured in
units that make the speed of light $c = 1$. (a) Write down an integral formula for
$u(t, x, y, z)$. (b) Suppose a light sensor is placed at the point $(2, 2, 1)$. For which values of
$t > 0$ will the sensor register a nonzero signal? Sketch a rough graph of what the sensor
measures. (You do not need to find the precise formula, but explain how you obtained your
graph.) (c) True or false: The solution $u(t, x, y, z) > 0$ at all points in space-time.

12.6.4.  Is  (12.151)  a  solution  to  the  wave  equation  for  t  <  0?  If  not,  write  down  a  solution 
formula  that  is  valid  for  negative  t. 


12.6.5.  True  or  false:  The  function  u(t,x,y,  z)  defined  by  (12.154)  is  everywhere  continuous. 

12.6.6.  A  thermonuclear  explosion  occurs  at  the  center  of  the  Earth.  Would  you  feel  the  effect 
first  through  a  motion  at  the  surface  or  a  change  in  temperature  at  the  surface?  Discuss. 

0  12.6.7.  Prove that the area of the spherical cap $S^{\mathbf{x}}_{t} \cap B_1$ is given by formula (12.152).


Descent  to  Two  Dimensions 


So far, we have found explicit formulas for the solution to the wave equation on the
one-dimensional line, and in three-dimensional space. The two-dimensional case

$$u_{tt} = c^2 \Delta u = c^2 (u_{xx} + u_{yy}) \tag{12.168}$$

is, counterintuitively, more complicated! For instance, seeking a radially symmetric solution
$u(t, r)$ requires solving the partial differential equation

$$\frac{\partial^2 u}{\partial t^2} = c^2 \left( \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\, \frac{\partial u}{\partial r} \right), \tag{12.169}$$


which,  unlike  its  three-dimensional  counterpart  (12.137),  is  not  so  easily  integrated. 

However, our solution to the three-dimensional problem can be adapted to construct
a  solution  to  the  two-dimensional  problem  using  the  so-called  Method  of  Descent.  Observe 
that  any  solution  u(t,x,y)  to  the  two-dimensional  wave  equation  (12.168)  can  be  viewed 
as  a  solution  to  the  three-dimensional  wave  equation  (12.123)  that  does  not  depend  upon 
the  vertical  z  coordinate,  whence  du/dz  =  0.  Clearly,  if  the  three-dimensional  initial  data 
does  not  depend  on  z,  then  the  resulting  solution  u(t,  x,  y)  will  also  be  independent  of  z. 

Consider first the zero initial displacement condition

$$u(0, x, y) = 0, \qquad \frac{\partial u}{\partial t}(0, x, y) = g(x, y). \tag{12.170}$$

In the three-dimensional solution formula (12.151), if $g(x, y)$ does not depend on the
$z$-coordinate, then the integrals over the upper and lower hemispheres

$$S^{+}_{ct} = \{\, \|\boldsymbol{\xi} - \mathbf{x}\| = ct, \ \zeta \ge z \,\}, \qquad S^{-}_{ct} = \{\, \|\boldsymbol{\xi} - \mathbf{x}\| = ct, \ \zeta \le z \,\},$$

are identical. To evaluate these integrals, we parametrize the upper hemisphere as the
graph of

$$\zeta = z + \sqrt{c^2 t^2 - (\xi - x)^2 - (\eta - y)^2} \quad \text{over the disk} \quad D^{\mathbf{x}}_{ct} = \{\, (\xi - x)^2 + (\eta - y)^2 \le c^2 t^2 \,\},$$

concluding that

$$u(t, x, y) = \frac{1}{4\pi c^2 t} \iint_{S^{\mathbf{x}}_{ct}} g(\xi, \eta)\, dS = \frac{1}{2\pi c^2 t} \iint_{S^{+}_{ct}} g(\xi, \eta)\, dS = \frac{1}{2\pi c} \iint_{D^{\mathbf{x}}_{ct}} \frac{g(\xi, \eta)}{\sqrt{c^2 t^2 - (\xi - x)^2 - (\eta - y)^2}}\; d\xi\, d\eta \tag{12.171}$$

solves the initial value problem (12.170). In particular, if we take the initial velocity

$$\frac{\partial u}{\partial t}(0, x, y) = g(x, y) = \delta(x)\, \delta(y)$$

to be a unit impulse concentrated at the origin, then the resulting solution is

$$u(t, x, y) = \begin{cases} \dfrac{1}{2\pi c \sqrt{c^2 t^2 - x^2 - y^2}}, & x^2 + y^2 < c^2 t^2, \\[8pt] 0, & x^2 + y^2 > c^2 t^2. \end{cases} \tag{12.172}$$

An observer sitting at distance $r = \|\mathbf{x}\| = \sqrt{x^2 + y^2}$ from the origin will first witness
a concentrated displacement singularity at time $t = r/c$. However, in contrast to the
three-dimensional solution, even after the impulse passes by, there will continue to be a
decreasing, but nonzero, signal of magnitude roughly proportional to $1/t$. In Figure 12.12,
we plot the solution (12.172) for unit wave speed $c = 1$. The first row plots intensity as a
function of $t$ at three different radii; note that the initial singularity, indicated by a spike in
the graph, is followed by a progressively smaller residual displacement, which never entirely
disappears. The second row shows the displacement at three different times as a function
of $r = \|\mathbf{x}\|$.
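The $1/t$ decay of the residual signal can be seen numerically by evaluating (12.172) at a fixed observer. The following is a small sketch; the function name `impulse_response_2d` is hypothetical, and returning 0 exactly on the light cone, where the formula is singular, is a convention adopted only for this illustration.

```python
import math


def impulse_response_2d(t, x, y, c=1.0):
    """Evaluate (12.172): response to the initial velocity delta(x) delta(y)."""
    s = c * c * t * t - x * x - y * y
    if t <= 0 or s <= 0:       # outside, or exactly on, the light cone: report no signal
        return 0.0
    return 1.0 / (2.0 * math.pi * c * math.sqrt(s))


# Observer at distance r = 1 with c = 1: nothing before t = 1, then an afterglow
# for which t * u(t) approaches 1 / (2 pi c^2) as t grows.
for t in (0.5, 2.0, 10.0, 100.0, 1000.0):
    print(t, t * impulse_response_2d(t, 1.0, 0.0))
```

The limiting value $1/(2\pi c^2)$ follows from (12.172) by factoring $ct$ out of the square root.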

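As in the three-dimensional case, the solution to the initial displacement conditions

$$u(0, x, y) = f(x, y), \qquad \frac{\partial u}{\partial t}(0, x, y) = 0, \tag{12.173}$$

can then be obtained by differentiation of (12.171) with respect to $t$, and so

$$u(t, x, y) = \frac{1}{2\pi c}\, \frac{\partial}{\partial t} \left( \iint_{D^{\mathbf{x}}_{ct}} \frac{f(\xi, \eta)}{\sqrt{c^2 t^2 - (\xi - x)^2 - (\eta - y)^2}}\; d\xi\, d\eta \right). \tag{12.174}$$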


As before, starting with a concentrated impulse, an observer will witness, after a time
lapse $t = r/c$, an abrupt impulse passing by, followed by a progressively decaying residual
wave. The general solution to the two-dimensional wave equation on all of $\mathbb{R}^2$ is a linear
combination of these two types of solutions (12.171, 174).
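The disk integral in (12.171) has an integrable singularity on the boundary circle, which makes naive quadrature awkward. One way around this, offered here as a sketch and not as the text's method, is the substitution $\rho = ct \sin\psi$ in polar coordinates centered at $(x, y)$, which turns (12.171) into $u = \frac{t}{2\pi} \int_0^{2\pi}\!\!\int_0^{\pi/2} g\, \sin\psi \, d\psi\, d\theta$. The function name and grid sizes below are assumptions.

```python
import math


def descent_velocity_solution(g, t, x, y, c=1.0, n_radial=200, n_angular=64):
    """Approximate (12.171) for t >= 0 by the midpoint rule.

    With rho = c t sin(psi) in polar coordinates about (x, y), the weight
    rho d(rho) / sqrt(c^2 t^2 - rho^2) becomes c t sin(psi) d(psi), so that
    u = (t / (2 pi)) * integral of g * sin(psi) over 0 <= psi <= pi/2, 0 <= theta <= 2 pi.
    """
    d_psi = (math.pi / 2.0) / n_radial
    d_theta = 2.0 * math.pi / n_angular
    total = 0.0
    for i in range(n_radial):
        psi = (i + 0.5) * d_psi
        rho = c * t * math.sin(psi)
        for j in range(n_angular):
            theta = (j + 0.5) * d_theta
            total += g(x + rho * math.cos(theta), y + rho * math.sin(theta)) * math.sin(psi)
    return t / (2.0 * math.pi) * total * d_psi * d_theta


# Test data g = x^2, for which u = t x^2 + c^2 t^3 / 3 solves (12.168), (12.170) exactly.
x0, y0, c, t = 0.3, -0.2, 2.0, 0.7
print(descent_velocity_solution(lambda x, y: x * x, t, x0, y0, c=c),
      t * x0 ** 2 + c ** 2 * t ** 3 / 3.0)
```

The initial displacement formula (12.174) could be approximated in the same way by applying a finite difference in $t$ to this function with $f$ in place of $g$.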

As a consequence of these considerations, we discover that Huygens’ Principle is not
valid in a two-dimensional universe. The solution to the two-dimensional wave equation
at a point $\mathbf{x}$ at time $t$ depends on the initial displacement and velocity on the entire disk
of radius $ct$ centered at the point, and not just on the points lying a distance $ct$ away.
So a two-dimensional creature would experience not just an initial effect of a concentrated
sound or light wave, but also an “afterglow” of slowly diminishing magnitude. It would be
like living in a permanent echo chamber, and so understanding and acting upon sensory
phenomena would be considerably more challenging. In general, it can be proved that
Huygens’ Principle for the wave equation is valid only in spaces of odd dimension
$n = 2k + 1 \ge 3$; see also [15] for recent advances in the classification of partial differential
equations that admit a Huygens’ Principle.

Figure 12.12. Solution to the two-dimensional wave equation for a concentrated impulse.

Remark :  Since  the  solutions  to  the  two-dimensional  wave  equation  can  be  interpreted 
as  three-dimensional  solutions  with  no  z  dependence,  a  concentrated  delta  impulse  for  the 
two-dimensional  wave  equation  would  correspond  to  a  three-dimensional  initial  impulse 
that  is  concentrated  along  an  entire  vertical  line,  e.g.,  an  instantaneous  lightning  bolt  in 
the  form  of  an  infinite  straight  line.  An  observer  fixed  in  space  will  first  encounter  the 
light  flash  arriving  from  the  closest  point  on  the  line,  but  will  subsequently  experience  the 
gradually  decreasing  effect  of  the  light  emitted  by  points  that  lie  progressively  farther  away 
along  the  line.  This  accounts  for  the  two-dimensional  afterglow  in  formula  (12.172). 


Exercises 

12.6.8. Solve the initial value problem for the two-dimensional wave equation with the following
initial data: (a) $u(0, x, y) = x - y$, $u_t(0, x, y) = 0$; (b) $u(0, x, y) = 0$, $u_t(0, x, y) = y$.

12.6.9. (a) Prove that $u(t, x, y) = 1/\sqrt{x^2 + y^2 - c^2 t^2}$ is a solution to the two-dimensional wave
equation on the domain $D = \{\, x^2 + y^2 > c^2 t^2 \,\}$ exterior to the light cone passing through
the origin. What is the corresponding initial data at $t = 0$? (b) Use part (a) to solve the
initial value problem $u(0, x, y) = 0$, $u_t(0, x, y) = 1/\sqrt{x^2 + y^2}$, on $D$.

12.6.10. Consider the two-dimensional wave equation on $\mathbb{R}^2$ with wave speed $c = 1$. Write
down an integral formula for the solution to the following initial value problems. You need
not evaluate the integrals. (a) $u(0, x, y) = x^3 - y^3$, $u_t(0, x, y) = 0$;
(b) $u(0, x, y) = 0$, $u_t(0, x, y) = y^2$; (c) $u(0, x, y) = x^2 + y^2$, $u_t(0, x, y) = -x^2 - y^2$.

12.6.11.  (a)  Find  the  solution  to  the  two-dimensional  wave  equation  whose  initial  displacement 
is  a  concentrated  delta  impulse  at  the  origin  and  whose  initial  velocity  is  zero. 

(b)  Is  your  expression  a  classical  solution  when  t  >  0? 

(c) True or false: The solution tends to 0 uniformly as $t \to \infty$.

12.6.12.  Use  separation  of  variables  to  write  down  an  eigenfunction  series  solution  to  the 
partial  differential  equation  (12.169)  when  subject  to  homogeneous  Dirichlet  boundary 
conditions  at  r  =  1  and  bounded  at  r  =  0. 

0  12.6.13.  Write  down  the  fundamental  solution  for  the  one-dimensional  wave  equation  with 
(a)  a  concentrated  initial  displacement  at  the  origin;  (b)  a  concentrated  initial  velocity 
at  the  origin,  (c)  Discuss  the  validity  of  Huygens’  Principle  in  a  one-dimensional  universe. 

12.6.14.  Discuss  how  you  can  construct  solutions  to  the  one-dimensional  wave  equation  by 
descent  from  the  three-dimensional  wave  equation. 


12.7  The  Hydrogen  Atom 

A hydrogen atom consists of a single electron circling an atomic nucleus that contains a
single proton, which, owing to its relatively tiny size, is assumed to be entirely concentrated
at the origin. As a result of quantization of the corresponding classical Coulomb problem,
the Schrödinger equation† governing the dynamical behavior of the electron moving around
the nucleus takes the explicit form

$$i \hbar\, \frac{\partial \psi}{\partial t} = -\,\frac{\hbar^2}{2M}\, \Delta \psi - \frac{\alpha^2}{r}\, \psi = -\,\frac{\hbar^2}{2M} \left( \frac{\partial^2 \psi}{\partial x^2} + \frac{\partial^2 \psi}{\partial y^2} + \frac{\partial^2 \psi}{\partial z^2} \right) - \frac{\alpha^2\, \psi}{\sqrt{x^2 + y^2 + z^2}}. \tag{12.175}$$

Here $\psi(t, x, y, z)$ denotes the electron’s time-dependent wave function, which, at each time
$t$, prescribes its quantum probability density as it circles the nucleus. In the quantized
Hamiltonian operator $K = -\tfrac{1}{2} (\hbar^2/M)\, \Delta - \alpha^2/r$, the coefficient of the Laplacian depends
on Planck’s constant $\hbar$ and the electron’s mass $M$. The final term represents the
three-dimensional electromagnetic (Coulomb) potential function $V(\mathbf{x}) = \alpha^2/r$ attracting the
electron to the nucleus, with $\alpha$ representing the electron’s (and proton’s) charge, while
$r = \|\mathbf{x}\|$ is its distance from the nucleus. Incidentally, the quantum-mechanical Schrödinger
equation for multi-electron atoms or even molecules is not so difficult to write down, but
its solution, even for, say, the helium atom, is much more difficult, and, in general, is still
a major challenge for numerical analysts, even on today’s supercomputers, [116]. Thus,
to keep matters as simple as possible, we will consider only the case of a single electron
hydrogen atom here.

† The reader is referred to (9.151) and the subsequent discussion for generalities regarding the
Schrödinger equation and quantum mechanics.


Bound  States 


According to the analysis in Section 9.5, the normal mode solutions to the Schrödinger
equation are of the form

$$\psi(t, x, y, z) = e^{-\,i \lambda t/\hbar}\, v(x, y, z),$$

where $v$ is an eigenfunction of the Hamiltonian operator with eigenvalue $\lambda$, and hence
satisfies

$$\frac{\hbar^2}{2M}\, \Delta v + \left( \lambda + \frac{\alpha^2}{r} \right) v = 0. \tag{12.176}$$

The bound states of the atom, in which the electron remains trapped by the nucleus, are
represented by the nonzero solutions to the eigenvalue problem (12.176) with unit $L^2$ norm:

$$\|v\|^2 = \iiint |\, v(x, y, z) \,|^2 \, dx\, dy\, dz = 1.$$

The eigenvalue $\lambda$ specifies the bound state’s energy, and is necessarily negative: $\lambda < 0$.
Since we are working on an unbounded domain, the bound states do not form a complete
system of eigenfunctions, and so not every wave function $\psi \in L^2(\mathbb{R}^3)$ can be approximated
by an eigenfunction series. The missing data are the so-called scattering states arising
from the continuous spectrum of the Schrödinger operator; these represent electrons that
scatter off the nucleus, and so do not remain trapped in an orbit. (For the classical Kepler
problem of a planet circling a sun, the bound states would correspond to planets following
bounded elliptic orbits, while the scattering states correspond to interstellar comets and
the like moving along unbounded hyperbolic or parabolic trajectories.) We will leave
the discussion of the quantum-mechanical scattering states and the associated continuous
spectrum to a more advanced treatment of the subject, [72, 95].

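To understand the bound states, we will apply the method of separation of variables.
We begin by rewriting the eigenvalue problem (12.176) in spherical coordinates:

$$\frac{\hbar^2}{2M} \left( \frac{\partial^2 v}{\partial r^2} + \frac{2}{r}\, \frac{\partial v}{\partial r} + \frac{1}{r^2}\, \frac{\partial^2 v}{\partial \varphi^2} + \frac{\cos\varphi}{r^2 \sin\varphi}\, \frac{\partial v}{\partial \varphi} + \frac{1}{r^2 \sin^2\varphi}\, \frac{\partial^2 v}{\partial \theta^2} \right) + \left( \lambda + \frac{\alpha^2}{r} \right) v = 0. \tag{12.177}$$

We then separate off the radial coordinate, setting

$$v(r, \varphi, \theta) = p(r)\, w(\varphi, \theta).$$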

The angular component satisfies the spherical Helmholtz equation

$$\Delta_S w + \mu w = \frac{\partial^2 w}{\partial \varphi^2} + \frac{\cos\varphi}{\sin\varphi}\, \frac{\partial w}{\partial \varphi} + \frac{1}{\sin^2\varphi}\, \frac{\partial^2 w}{\partial \theta^2} + \mu w = 0,$$

which we have already solved; see (12.21) and the ensuing discussion. The eigensolutions
are spherical harmonics, which, because the quantum-mechanical solutions are intrinsically
complex-valued, we take in their complex form (12.46). The associated eigenvalue

$$\mu = l\,(l + 1), \qquad \text{where the integer} \qquad l = 0, 1, 2, \ldots \tag{12.178}$$

is known as the angular quantum number, admits a total of $2l + 1$ linearly independent
eigenfunctions

$$Y_l^m(\varphi, \theta) = P_l^m(\cos\varphi)\, e^{\,i m \theta}, \qquad m = -l, -l + 1, \ldots, l - 1, l. \tag{12.179}$$


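The radial equation with the separation constant (12.178) is

$$\frac{\hbar^2}{2M} \left( \frac{d^2 p}{dr^2} + \frac{2}{r}\, \frac{dp}{dr} \right) + \left( \lambda + \frac{\alpha^2}{r} - \frac{\hbar^2}{2M}\, \frac{l\,(l + 1)}{r^2} \right) p = 0. \tag{12.180}$$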


To eliminate the physical parameters, let us rescale the radial coordinate by setting

$$s = \sigma r, \qquad \text{where} \qquad \sigma = \frac{2 \sqrt{-2M\lambda}}{\hbar}, \tag{12.181}$$

given that $\lambda < 0$. The resulting ordinary differential equation for the rescaled function
$P(s) = p(s/\sigma)$ is

$$\frac{d^2 P}{ds^2} + \frac{2}{s}\, \frac{dP}{ds} - \left( \frac{1}{4} - \frac{n}{s} + \frac{l\,(l + 1)}{s^2} \right) P = 0, \tag{12.182}$$


where

$$n = \frac{2M\alpha^2}{\sigma \hbar^2} = \frac{\alpha^2}{\hbar} \sqrt{-\,\frac{M}{2\lambda}}. \tag{12.183}$$
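For readers who wish to check the rescaling, the substitution can be sketched as follows. The starting point is assumed to be the radial equation in the form $\frac{\hbar^2}{2M}\bigl(p'' + \frac{2}{r}p'\bigr) + \bigl(\lambda + \frac{\alpha^2}{r} - \frac{\hbar^2}{2M}\frac{l(l+1)}{r^2}\bigr)p = 0$, which is what separating variables in (12.177) with $\mu = l(l+1)$ produces.

```latex
% Set p(r) = P(\sigma r), so that p' = \sigma P', \; p'' = \sigma^2 P'', \; r = s/\sigma :
\frac{\hbar^2 \sigma^2}{2M} \left( P'' + \frac{2}{s}\, P' \right)
  + \left( \lambda + \frac{\alpha^2 \sigma}{s}
  - \frac{\hbar^2 \sigma^2}{2M}\, \frac{l(l+1)}{s^2} \right) P = 0 .
% Divide by \hbar^2 \sigma^2 / (2M) and use \sigma^2 = -8 M \lambda / \hbar^2 from (12.181):
\frac{2M\lambda}{\hbar^2 \sigma^2} = -\frac{1}{4},
\qquad
\frac{2M\alpha^2}{\hbar^2 \sigma} = n ,
% which leaves
P'' + \frac{2}{s}\, P' - \left( \frac{1}{4} - \frac{n}{s} + \frac{l(l+1)}{s^2} \right) P = 0 .
```

The choice of $\sigma$ is thus exactly the one that normalizes the constant term to $-\tfrac14$, leaving $n$ as the only remaining parameter besides $l$.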


Equation (12.182) is a version of the generalized Laguerre differential equation (see
Exercise 12.7.4 below), named after the nineteenth-century French mathematician Edmond
Laguerre, who studied its solutions well before the appearance of quantum mechanics. Since
we are searching for bound states, the relevant solutions should be defined on $0 < s < \infty$,
remain bounded at $s = 0$, and go to zero as $s \to \infty$:

$$\lim_{s \to 0^+} P(s) < \infty, \qquad \lim_{s \to \infty} P(s) = 0. \tag{12.184}$$


The  proof  of  the  following  key  result  is  outlined  in  Exercises  12.7.4-5. 


Theorem 12.19. For each pair of nonnegative integers $0 \le l < n$, the boundary
value problem (12.182, 184) has the eigensolution

$$P_l^n(s) = s^l\, e^{-s/2}\, L^{2l+1}_{n-l-1}(s), \tag{12.185}$$

where

$$L_k^j(s) = \frac{s^{-j}\, e^{s}}{k!}\, \frac{d^k}{ds^k} \Bigl[ s^{j+k}\, e^{-s} \Bigr] = \sum_{i=0}^{k} \frac{(-1)^i}{i!} \binom{j+k}{j+i}\, s^i, \qquad j, k = 0, 1, 2, \ldots, \tag{12.186}$$

are known as generalized† Laguerre polynomials.

† The ordinary Laguerre polynomials are $L_k(s) = L_k^0(s)$.

Figure 12.13. Generalized Laguerre polynomials.


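The first few generalized Laguerre polynomials are

$$\begin{aligned}
L_0^0(s) &= 1, & L_1^0(s) &= 1 - s, & L_2^0(s) &= 1 - 2s + \tfrac{1}{2} s^2, & L_3^0(s) &= 1 - 3s + \tfrac{3}{2} s^2 - \tfrac{1}{6} s^3, \\
L_0^1(s) &= 1, & L_1^1(s) &= 2 - s, & L_2^1(s) &= 3 - 3s + \tfrac{1}{2} s^2, & L_3^1(s) &= 4 - 6s + 2 s^2 - \tfrac{1}{6} s^3, \\
L_0^2(s) &= 1, & L_1^2(s) &= 3 - s, & L_2^2(s) &= 6 - 4s + \tfrac{1}{2} s^2, & L_3^2(s) &= 10 - 10 s + \tfrac{5}{2} s^2 - \tfrac{1}{6} s^3.
\end{aligned}$$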


Note that $L_k^j(s)$ has degree $k$. A few graphs, on the interval $0 \le s \le 6$, appear in
Figure 12.13. See [86] for details on their properties.
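The tabulated polynomials can be regenerated, and the differential equation they satisfy checked, with exact rational arithmetic. The sketch below uses the explicit coefficient formula $a_i = \frac{(-1)^i}{i!}\binom{j+k}{k-i}$, a standard closed form that reproduces the table above. The function names are hypothetical, and the equation tested is the generalized Laguerre equation $s\,u'' + (j + 1 - s)\,u' + k\,u = 0$ of Exercise 12.7.4.

```python
from fractions import Fraction
from math import comb, factorial


def laguerre(j, k):
    """Coefficients [a_0, ..., a_k] of L_k^j(s) = sum_i a_i s^i, from the explicit sum
    a_i = (-1)^i / i! * binom(j + k, k - i)."""
    return [Fraction((-1) ** i * comb(j + k, k - i), factorial(i)) for i in range(k + 1)]


def derivative(p):
    """Coefficients of the derivative of the polynomial with coefficient list p."""
    return [i * a for i, a in enumerate(p)][1:] or [Fraction(0)]


def laguerre_ode_residual(j, k):
    """Coefficients of s u'' + (j + 1 - s) u' + k u for u = L_k^j; all should vanish."""
    u = laguerre(j, k)
    du = derivative(u)
    d2u = derivative(du)
    res = [Fraction(0)] * (k + 2)
    for i, a in enumerate(d2u):
        res[i + 1] += a              # contribution of s u''
    for i, a in enumerate(du):
        res[i] += (j + 1) * a        # contribution of (j + 1) u'
        res[i + 1] -= a              # contribution of -s u'
    for i, a in enumerate(u):
        res[i] += k * a              # contribution of k u
    return res


print([str(a) for a in laguerre(2, 3)])          # coefficients of L_3^2
print(all(c == 0 for c in laguerre_ode_residual(2, 3)))
```

Taking $j = 2l + 1$ and $k = n - l - 1$ gives the polynomial factor appearing in the eigensolution (12.185).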


Atomic  Eigenstates  and  Quantum  Numbers 


The integer $n$, whose physical value was noted in (12.183), is known as the principal
quantum number. We further note that the scaling factor in (12.181) can be written as

$$\sigma = \frac{2M\alpha^2}{n\, \hbar^2} = \frac{2}{n a}, \qquad \text{where} \qquad a = \frac{\hbar^2}{M \alpha^2} \approx 0.529 \times 10^{-10} \ \text{meter},$$

which approximates the radius of the electron’s lowest energy level, is known as the Bohr
radius, in honor of the pioneering Danish quantum physicist Niels Bohr. Reverting to
physical coordinates, the bound state solutions (12.185) become, up to an inessential constant
multiple, the radial wave functions

$$p_l^n(r) = \left( \frac{2r}{n a} \right)^{l} e^{-r/(n a)}\, L^{2l+1}_{n-l-1}\!\left( \frac{2r}{n a} \right). \tag{12.187}$$

Combining them with the spherical harmonics (12.179) yields the atomic eigenfunctions
or eigenstates

$$v_{lmn}(r, \varphi, \theta) = \sqrt{ \frac{(2l + 1)\, (l - m)!\, (n - l - 1)!}{\pi\, a^3\, n^4\, (l + m)!\, (l + n)!} } \;\, p_l^n(r)\, Y_l^m(\varphi, \theta), \tag{12.188}$$

where the initial factor is selected so as to make $\|v_{lmn}\| = 1$, and hence a bona fide wave
function. (A proof of this fact is outlined in Exercise 12.7.8.) The eigenstates depend on
three integers, which have the following physical designations:

•  $n = 1, 2, 3, \ldots$: the principal quantum number;

•  $l = 0, 1, \ldots, n - 1$: the angular quantum number;

•  $m = -l, -l + 1, \ldots, l - 1, l$: the magnetic quantum number.

The energy is the associated eigenvalue:

$$\lambda_n = -\,\frac{\alpha^4 M}{2 \hbar^2}\, \frac{1}{n^2} = -\,\frac{\alpha^2}{2a}\, \frac{1}{n^2}. \tag{12.189}$$
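As a rough numerical illustration of the Bohr radius $a = \hbar^2/(M\alpha^2)$ and the energy formula (12.189), the sketch below plugs in physical constants in Gaussian (CGS) units, where the Coulomb potential has the form $\alpha^2/r$ with $\alpha$ the elementary charge. The numerical constants are rounded reference values that are not given in the text, and the function names are hypothetical.

```python
# Rounded CGS (Gaussian) values; these constants are assumptions, not taken from the text.
HBAR = 1.054571817e-27        # reduced Planck constant, erg s
M_E = 9.1093837e-28           # electron mass, g
ALPHA = 4.80320471e-10        # elementary charge, statcoulomb (the text's alpha)
ERG_PER_EV = 1.602176634e-12  # conversion factor from electron volts to erg


def bohr_radius_cm():
    """Bohr radius a = hbar^2 / (M alpha^2), in centimeters."""
    return HBAR ** 2 / (M_E * ALPHA ** 2)


def energy_level_ev(n):
    """Energy (12.189): lambda_n = -(alpha^2 / (2 a)) / n^2, converted to electron volts."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return -(ALPHA ** 2 / (2.0 * bohr_radius_cm())) / n ** 2 / ERG_PER_EV


print(bohr_radius_cm())                                   # in cm; multiply by 1e-2 for meters
print([round(energy_level_ev(n), 3) for n in (1, 2, 3, 4)])
```

With these inputs the radius comes out close to the quoted $0.529 \times 10^{-10}$ meter, and the ground state energy close to $-13.6$ electron volts, with the higher levels scaling as $1/n^2$.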


The fact that the ratios $\lambda_n/\lambda_1 = 1/n^2$ between the energy levels of an atom are inverse
squares of integers was one of the key experimental discoveries that precipitated the
discovery of quantum mechanics. Observe that the $n$th energy level has a total of

$$\sum_{l=0}^{n-1} (2l + 1) = n^2 \tag{12.190}$$

linearly independent bound states (12.188). The dimension of the eigenspace corresponds to
the number of orbital subshells in the atom for the corresponding energy level. The shells
indexed by the angular quantum number, i.e., the order $l = 0, 1, 2, \ldots$ of the spherical
harmonic, are traditionally labeled by a letter in the sequence $s, p, d, f, g, \ldots$, where each
successive shell contains $2l + 1$ individual subshells, indexed by the magnetic quantum
number $m$.

The one missing ingredient in this simple model is the electron’s spin. Since electrons
can have one of two possible spins, the Pauli Exclusion Principle, first formulated by the
Austrian physicist Wolfgang Pauli, tells us that each individual subshell can be occupied
by at most two electrons. Consequently, the atomic shell with angular quantum number $l$
may contain up to $2(2l + 1)$ electrons. Keep in mind that, since $0 \le l < n$, the $l$th shell
appears only when $n$ is sufficiently large, so that, according to (12.190), the $n$th energy
level contains up to $2n^2$ electrons.
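The counting can be confirmed by simply listing the admissible quantum numbers. This is a small illustrative sketch; the function name `bound_states` and the lookup string of shell letters are hypothetical conveniences.

```python
def bound_states(n):
    """All (l, m) pairs allowed at principal quantum number n:
    l = 0, ..., n - 1 and m = -l, ..., l."""
    return [(l, m) for l in range(n) for m in range(-l, l + 1)]


SHELL_LETTERS = "spdfg"   # traditional labels for l = 0, 1, 2, 3, 4

for n in range(1, 5):
    states = bound_states(n)
    subshells_per_shell = {SHELL_LETTERS[l]: 2 * l + 1 for l in range(n)}
    # n^2 orbital states; doubling for the two spins gives the 2 n^2 electron capacity
    print(n, len(states), 2 * len(states), subshells_per_shell)
```

Each row lists $n$, the number $n^2$ of independent bound states, the electron capacity $2n^2$, and the number $2l + 1$ of subshells in each shell.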

The resulting atomic configuration of electronic energy shells is the explanation for
Mendeleev’s periodic table. Its rows are indexed by the principal quantum number $n$,
while the columns are labeled by the angular and magnetic quantum numbers $l, m$, and
the spin. As one moves up the periodic table, the electrons in each successive element’s
atom progressively fill up the lower energy levels, each new shell containing first a single
electron, then two electrons with opposite spins. Thus, hydrogen (in its ground state) has
a single electron in the 1s shell. Helium has two electrons in the 1s shell. Lithium has
three electrons, with two of them filling the 1s shell and the third in the 2s shell. Neon
has ten electrons filling the first two energy levels, with two electrons in the 1s shell, two
in the 2s shell, and six in the 2p shell. And so on. The one complication is that, owing
to the orbital’s geometry, as prescribed by the associated spherical harmonic, the angular
and, to a lesser extent, magnetic quantum numbers also affect the physically observed
energy, and this can cause shells to fill later than might initially be expected. For example,
in potassium and calcium, the 4s shell is successively filled, followed by scandium, which
begins the process of filling the 3d subshells. The chemical properties of the elements are,
to a very large extent, determined by the placement of their atom’s electrons within the
outermost energy level. The interested reader can consult, for example, [67, 79] for further
details.


Exercises 


12.7.1. If the nucleus contains $Z$ protons circled by a single electron, then its atomic potential
$V(\mathbf{x})$ is rescaled accordingly, replacing $\alpha^2/r$ by $Z \alpha^2/r$. Discuss the induced effect on the
energy levels of such an atomic ion.

T  12.7.2.  (a) Write down the time-dependent wave function for a single electron atom when the
electron is in its ground state, i.e., the lowest energy level. (b) What is the probability
density of the electron? (c) What is the probability of finding the electron within 1 Bohr
radius of the atom? (d) Find the distance $d$ (measured in Bohr radii) so that there is a
95% probability of finding the electron within a distance $d$ of the nucleus.


0  12.7.3.  Prove  that  the  two  expressions  for  the  Laguerre  polynomials  in  (12.186)  agree. 

0  12.7.4.  (a) Let $k = 0, 1, 2, \ldots$ be a nonnegative integer. The Laguerre differential equation of
order $k$ is

$$x\, u'' + (1 - x)\, u' + k\, u = 0. \tag{12.191}$$

Show that $x = 0$ is a regular singular point. Then prove that the Frobenius solution based
at $x = 0$ is a polynomial of degree $k$ that coincides with the Laguerre polynomial $L_k(x)$.

(b) Given nonnegative integers $j, k \ge 0$, use the Frobenius method to prove that the
generalized Laguerre differential equation

$$x\, u'' + (j + 1 - x)\, u' + k\, u = 0 \tag{12.192}$$

has a polynomial solution that can be identified with the generalized Laguerre
polynomial $L_k^j(x)$ in (12.186).

0  12.7.5.  Suppose that $P(s)$ solves the ordinary differential equation (12.182). Prove that
$Q(s) = s^{-l}\, e^{s/2}\, P(s)$ solves the differential equation

$$s\, \frac{d^2 Q}{ds^2} + \bigl[\, 2(l + 1) - s \,\bigr]\, \frac{dQ}{ds} + (n - l - 1)\, Q = 0. \tag{12.193}$$

Then apply the result of Exercise 12.7.4 to complete the proof of Theorem 12.19.


T  12.7.6.  Suppose $f(s)$ is a polynomial, and let $L_k^j(s)$ denote the generalized Laguerre
polynomials (12.186). (a) Prove that, for $j, k \ge 0$,

$$\int_0^\infty f(s)\, L_k^j(s)\, s^j e^{-s}\, ds = \frac{(-1)^k}{k!} \int_0^\infty f^{(k)}(s)\, s^{j+k} e^{-s}\, ds.$$

(b) For fixed $j$, prove that the generalized Laguerre polynomials $L_k^j(s)$, $k = 0, 1, 2, \ldots$,
are orthogonal with respect to the weighted inner product $\langle f, g \rangle = \int_0^\infty f(s)\, g(s)\, s^j e^{-s}\, ds$.

(c) Prove the formula for their corresponding norms: $\| L_k^j \| = \sqrt{\dfrac{(j + k)!}{k!}}$.

0  12.7.7.  (a) Prove that the generalized Laguerre polynomials satisfy the following recurrence
relation:

$$(k + 1)\, L_{k+1}^j(s) - (j + 2k + 1 - s)\, L_k^j(s) + (j + k)\, L_{k-1}^j(s) = 0. \tag{12.194}$$

(b) Prove that

$$\int_0^\infty s^{j+1} e^{-s}\, \bigl[ L_k^j(s) \bigr]^2 ds = \frac{(j + 2k + 1)\, (j + k)!}{k!}. \tag{12.195}$$

Hint: Use part (a) and Exercise 12.7.6.


^ ? 12.7.8. Prove that the atomic eigenfunctions (12.188) form an orthonormal system of wave functions with respect to the $L^2$ inner product on $\mathbb{R}^3$.

Hint: Use Theorem 9.33 and equation (12.195).


Appendix  A 

Complex  Numbers 


The  purpose  of  this  short  appendix  is  to  review  the  basics  of  complex  numbers  and  complex 
arithmetic,  which  are  used  throughout  much  of  the  text. 

A complex number is an expression of the form $z = x + i\,y$, where $x, y \in \mathbb{R}$ are real and $i = \sqrt{-1}$ is the imaginary unit. The set of all complex numbers is denoted by $\mathbb{C}$. We call $x = \operatorname{Re} z$ the real part of $z$ and $y = \operatorname{Im} z$ the imaginary part of $z = x + i\,y$. (Note: The imaginary part is the real number $y$, not $i\,y$.) A real number $x$ is merely a complex number with zero imaginary part, $\operatorname{Im} z = 0$, and so we may regard $\mathbb{R} \subset \mathbb{C}$. Complex addition and multiplication are based on simple adaptations of the rules of real arithmetic to include the identity $i^2 = -1$, and so

$$ (x + i\,y) + (u + i\,v) = (x + u) + i\,(y + v), \qquad (x + i\,y)\,(u + i\,v) = (x u - y v) + i\,(x v + y u). \tag{A.1} $$

Complex  numbers  enjoy  all  the  usual  laws  of  real  addition  and  multiplication,  including 
commutativity :  zw  =  wz. 

We can identify a complex number $x + i\,y$ with a vector $(x, y) \in \mathbb{R}^2$ in the real, two-dimensional plane. For this reason, $\mathbb{C}$ is sometimes referred to as the complex plane. (Although keep in mind that, as a complex vector space, $\mathbb{C}$ is only one-dimensional.) Based on this identification, we shall employ the standard terminology of planar vector calculus (domain, curve, etc.) without alteration. Complex addition (A.1) corresponds to vector addition, but the vector interpretation of complex multiplication is more obscure.

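The complex conjugate of $z = x + i\,y$ is $\bar z = x - i\,y$. Note that $\operatorname{Re} \bar z = \operatorname{Re} z$, while $\operatorname{Im} \bar z = -\operatorname{Im} z$. Geometrically, the complex conjugate of $z$ is obtained by reflecting the corresponding vector through the real axis, as illustrated in Figure A.1. In particular, $\bar z = z$ if and only if $z$ is real. In general,

$$ \operatorname{Re} z = \frac{z + \bar z}{2}, \qquad \operatorname{Im} z = \frac{z - \bar z}{2\,i}. $$

Complex conjugation is compatible with complex arithmetic:

$$ \overline{z + w} = \bar z + \bar w, \qquad \overline{z\,w} = \bar z\,\bar w. \tag{A.2} $$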

In particular, the product of a complex number and its conjugate,

$$ z\,\bar z = (x + i\,y)\,(x - i\,y) = x^2 + y^2, \tag{A.3} $$

is real and nonnegative. Its square root is known as the modulus or norm of the complex number $z = x + i\,y$, and written

$$ |z| = \sqrt{x^2 + y^2}. \tag{A.4} $$

Figure A.1. Complex numbers.

Note that $|z| \geq 0$, with $|z| = 0$ if and only if $z = 0$. The modulus $|z|$ generalizes the absolute value of a real number and coincides with the standard Euclidean norm in the $(x, y)$-plane. This implies the validity of the triangle inequality

$$ |z + w| \leq |z| + |w|. \tag{A.5} $$


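Equation (A.3) can be rewritten in terms of the modulus as

$$ z\,\bar z = |z|^2. \tag{A.6} $$

Rearranging the factors, we deduce the formula for the reciprocal of a nonzero complex number:

$$ \frac{1}{z} = \frac{\bar z}{|z|^2}, \quad z \neq 0, \qquad \text{or, equivalently,} \qquad \frac{1}{x + i\,y} = \frac{x - i\,y}{x^2 + y^2}. \tag{A.7} $$

The general formula for complex division,

$$ \frac{w}{z} = \frac{w\,\bar z}{|z|^2}, \qquad \text{or} \qquad \frac{u + i\,v}{x + i\,y} = \frac{(x u + y v) + i\,(x v - y u)}{x^2 + y^2}, \tag{A.8} $$

is an immediate consequence.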
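
As a quick numerical illustration of (A.1), (A.3), (A.7), and (A.8), the following minimal Python sketch compares the componentwise formulas with Python's built-in complex arithmetic. The sample values of z and w are arbitrary choices made for this illustration and do not come from the text.

```python
# Sample values (arbitrary, chosen only for illustration).
z = complex(3.0, 4.0)    # z = x + i y
w = complex(1.0, -2.0)   # w = u + i v
x, y = z.real, z.imag
u, v = w.real, w.imag

# (A.1): product assembled from real and imaginary parts.
product = complex(x * u - y * v, x * v + y * u)

# (A.3): z times its conjugate is the real number x^2 + y^2.
modulus_squared = (z * z.conjugate()).real

# (A.7): reciprocal of z written as conjugate(z) / |z|^2.
reciprocal = z.conjugate() / modulus_squared

# (A.8): quotient w / z assembled from real and imaginary parts.
quotient = complex(x * u + y * v, x * v - y * u) / modulus_squared

# Each pair printed below should agree up to rounding.
print(product, z * w)
print(reciprocal, 1 / z)
print(quotient, w / z)
```

Working with the conjugate turns every division into a multiplication followed by a division by the real number $|z|^2$, which is the point of formulas (A.7) and (A.8).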

The modulus of a complex number,

$$ r = |z| = \sqrt{x^2 + y^2}, $$

is one component of its polar coordinate representation

$$ x = r \cos\theta, \quad y = r \sin\theta, \qquad \text{or} \qquad z = r\,(\cos\theta + i \sin\theta). \tag{A.9} $$

The polar angle $\theta$, which measures the angle that the line connecting $z$ to the origin makes with the horizontal axis, is known as the phase, and written

$$ \theta = \operatorname{ph} z. \tag{A.10} $$

As such, the phase is defined only up to an integer multiple of $2\pi$. The unique principal value of the phase is restricted to $-\pi < \operatorname{ph} z \leq \pi$. A more common term for the polar angle is the argument of $z$, written $\arg z = \operatorname{ph} z$. However, in conformity with [85, 86], we prefer to use “phase” here, in part to avoid confusion with the argument $z$ of a function $f(z)$.

Euler's celebrated formula for the complex exponential,

$$ e^{i\theta} = \cos\theta + i \sin\theta, \tag{A.11} $$

can be used to compactly rewrite the polar form (A.9) of a complex number as

$$ z = r\,e^{i\theta}, \qquad \text{where} \qquad r = |z|, \quad \theta = \operatorname{ph} z. \tag{A.12} $$

Consequently, the complex logarithm has the form

$$ \log z = \log\bigl(r\,e^{i\theta}\bigr) = \log r + \log e^{i\theta} = \log r + i\,\theta = \log |z| + i \operatorname{ph} z. \tag{A.13} $$

More generally, the complex exponential is given by

$$ e^z = e^x \cos y + i\,e^x \sin y \qquad \text{for} \qquad z = x + i\,y. \tag{A.14} $$


We note that the modulus and phase of a product of complex numbers can be readily computed:

$$ |z\,w| = |z|\,|w|, \qquad \operatorname{ph}(z\,w) = \operatorname{ph} z + \operatorname{ph} w, \tag{A.15} $$

the latter formula requiring that we allow multiply valued phases; the formula does not hold as stated for all $z, w$ when the principal value of the phase is used. Similarly, the modulus and phase of the reciprocal of a nonzero complex number are

$$ \left| \frac{1}{z} \right| = \frac{1}{|z|}, \qquad \operatorname{ph}\left( \frac{1}{z} \right) = -\operatorname{ph} z. \tag{A.16} $$

On the other hand, complex conjugation preserves the modulus, but negates the phase:

$$ |\bar z| = |z|, \qquad \operatorname{ph} \bar z = -\operatorname{ph} z. \tag{A.17} $$

The latter formula is not valid for the principal value of the phase when $z$ lies on the negative real axis.
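
The caveat about principal values in (A.15) can be seen numerically. The sketch below uses Python's standard cmath module, whose phase function returns a principal value between $-\pi$ and $\pi$; the two sample numbers are arbitrary choices for illustration.

```python
import cmath
import math

# Two sample numbers given in polar form (modulus, phase); arbitrary choices.
z = cmath.rect(2.0, 3.0)   # phase 3 radians, just below pi
w = cmath.rect(1.5, 2.0)   # phase 2 radians

# Sum of the principal phases: about 5, which lies outside (-pi, pi].
phase_sum = cmath.phase(z) + cmath.phase(w)

# Principal phase of the product: about 5 - 2*pi.
phase_product = cmath.phase(z * w)

# Moduli multiply as in (A.15); phases agree only up to a multiple of 2*pi.
modulus_ok = math.isclose(abs(z * w), abs(z) * abs(w))
shift = (phase_sum - phase_product) / (2.0 * math.pi)

print(modulus_ok)
print(phase_sum, phase_product, shift)
```

Here the two phases differ by exactly one full turn, which is what "allowing multiply valued phases" amounts to in practice.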


Appendix  B 

Linear  Algebra 


In  this  appendix,  we  collect  basic  results  and  definitions  from  linear  algebra  that  are  used 
in  our  study  of  partial  differential  equations.  The  reader  is  referred  to  [89]  for  the  proofs 
and  further  details. 

B.1 Vector Spaces and Subspaces

Vector spaces and their ancillary structures provide the common language of linear algebra. The basic definition is modeled on the prototypical finite-dimensional example: the Euclidean space $\mathbb{R}^n$, which is the set of all real (column) vectors with $n$ entries, equipped with the operations of vector addition and scalar multiplication. More generally:

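Definition B.1. A (real) vector space is a set $V$ equipped with two operations:

(i) Addition: adding any pair of elements $\mathbf{v}, \mathbf{w} \in V$ produces another vector $\mathbf{v} + \mathbf{w} \in V$.

(ii) Scalar Multiplication: multiplying an element $\mathbf{v} \in V$ by a scalar $c \in \mathbb{R}$ produces a vector $c\,\mathbf{v} \in V$.

These are subject to the following axioms: for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$ and all scalars $c, d \in \mathbb{R}$,

(a) Commutativity of Addition: $\mathbf{v} + \mathbf{w} = \mathbf{w} + \mathbf{v}$.

(b) Associativity of Addition: $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$.

(c) Additive Identity: There is a zero element $\mathbf{0} \in V$ satisfying $\mathbf{v} + \mathbf{0} = \mathbf{v} = \mathbf{0} + \mathbf{v}$.

(d) Additive Inverse: For each $\mathbf{v} \in V$ there is an element $-\mathbf{v} \in V$ such that $\mathbf{v} + (-\mathbf{v}) = \mathbf{0} = (-\mathbf{v}) + \mathbf{v}$.

(e) Distributivity: $(c + d)\,\mathbf{v} = (c\,\mathbf{v}) + (d\,\mathbf{v})$, and $c\,(\mathbf{v} + \mathbf{w}) = (c\,\mathbf{v}) + (c\,\mathbf{w})$.

(f) Associativity of Scalar Multiplication: $c\,(d\,\mathbf{v}) = (c\,d)\,\mathbf{v}$.

(g) Unit for Scalar Multiplication: the scalar $1 \in \mathbb{R}$ satisfies $1\,\mathbf{v} = \mathbf{v}$.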

Complex vector spaces are defined in an identical manner, the only difference being that the scalars are allowed to be complex numbers. In this case, the prototype is the space $\mathbb{C}^n$ consisting of column vectors with $n$ complex entries.

While finite-dimensional vector spaces play a significant role in the study of partial differential equations, particularly in the design of numerical solution schemes, for us the more important examples are infinite-dimensional vector spaces whose elements (“vectors”) are functions. The main example is the following:

Example B.2. Let $I \subset \mathbb{R}$ be an interval. The function space $\mathcal{F} = \mathcal{F}(I)$, whose elements are all real-valued functions $f(x)$ defined for $x \in I$, has the structure of a vector space. Addition of functions in $\mathcal{F}$ is defined in the usual manner: $(f + g)(x) = f(x) + g(x)$ for all $x \in I$. Multiplication by scalars $c \in \mathbb{R}$ is the same as multiplication by constants, $(c\,f)(x) = c\,f(x)$. The zero element is the constant function that is identically 0 for all $x \in I$. With these operations, all the vector space axioms listed in Definition B.1 are valid, and hence $\mathcal{F}(I)$ is a real vector space.

More generally, if $\Omega \subset \mathbb{R}^n$ is any subset of $n$-dimensional Euclidean space, the function space $\mathcal{F}(\Omega)$ is defined as the set of all real-valued functions $f(x_1, \ldots, x_n)$ defined for all $\mathbf{x} = (x_1, \ldots, x_n) \in \Omega$. Addition and scalar (constant) multiplication of functions are defined in the same manner.


A subspace of a vector space $V$ is a subset $W \subset V$ that is a vector space in its own right. In particular, a subspace $W$ must contain the zero element of $V$.


Proposition B.3. A nonempty subset $W \subset V$ of a vector space is a subspace if and only if

(a) for every $\mathbf{v}, \mathbf{w} \in W$, the sum $\mathbf{v} + \mathbf{w} \in W$, and

(b) for every $\mathbf{v} \in W$ and every $c \in \mathbb{R}$, the scalar product $c\,\mathbf{v} \in W$.

For example, a complete list of subspaces of $V = \mathbb{R}^3$ is (i) the origin $\{\mathbf{0}\}$; (ii) every line through the origin; (iii) every plane through the origin; (iv) all of $\mathbb{R}^3$.


Example B.4. Here are some examples of subspaces of the function space $\mathcal{F}(I)$.

(a) The space $\mathcal{P}^{(n)}$ of polynomials of degree $\leq n$.

(b) The space $C^0(I)$ of all continuous functions on the interval $I$.

(c) The space $C^n(I)$ consisting of all functions $f(x)$ that have $n$ continuous derivatives $f'(x), f''(x), \ldots, f^{(n)}(x)$ on $I$.

(d) The space $C^\infty(I) = \bigcap_{n \geq 0} C^n(I)$ of infinitely differentiable, or smooth, functions is also a subspace.

(e) The space $\mathcal{A}(I)$ of analytic functions. Recall that a function $f(x)$ is called analytic at a point $a$ if it is smooth, and, moreover, its Taylor series

$$ f(a) + f'(a)\,(x - a) + \tfrac{1}{2}\,f''(a)\,(x - a)^2 + \cdots = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}\,(x - a)^n \tag{B.1} $$

converges to $f(x)$ for all $x$ sufficiently close to $a$. (The series is not required to converge on the entire interval $I$.) Not every smooth function is analytic, and so $\mathcal{A}(I) \subsetneq C^\infty(I)$; see Exercise 11.3.21 for an explicit example.


B.2  Bases  and  Dimension 


Definition B.5. Let $\mathbf{v}_1, \ldots, \mathbf{v}_k$ belong to a vector space $V$. A sum of the form

$$ c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k \mathbf{v}_k = \sum_{i=1}^{k} c_i \mathbf{v}_i, \tag{B.2} $$

where the coefficients $c_1, c_2, \ldots, c_k$ are any scalars, is known as a linear combination of the elements $\mathbf{v}_1, \ldots, \mathbf{v}_k$. Their span is the subspace $W = \operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_k\} \subset V$ consisting of all possible linear combinations.

Footnote to Example B.4(c): We use one-sided derivatives at any endpoint that belongs to the interval.

Definition B.6. The elements $\mathbf{v}_1, \ldots, \mathbf{v}_k \in V$ are called linearly dependent if there exist scalars $c_1, \ldots, c_k$, not all zero, such that

$$ c_1 \mathbf{v}_1 + \cdots + c_k \mathbf{v}_k = \mathbf{0}. \tag{B.3} $$

Elements that are not linearly dependent are called linearly independent.

In particular, a collection of functions $f_1(x), \ldots, f_n(x)$ is linearly dependent if and only if there exist constants $c_1, \ldots, c_n$, not all zero, such that the linear combination

$$ c_1 f_1(x) + \cdots + c_n f_n(x) \equiv 0 \tag{B.4} $$

is identically zero. Conversely, if the only choice of constants for which (B.4) holds is $c_1 = \cdots = c_n = 0$, then the functions are linearly independent.

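Definition B.7. A basis of a vector space $V$ is a finite collection of elements $\mathbf{v}_1, \ldots, \mathbf{v}_n \in V$ that (a) spans $V$, and (b) is linearly independent.

The simplest example is the standard basis of $\mathbb{R}^n$, consisting of the $n$ vectors

$$ \mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}, \qquad \mathbf{e}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \\ 0 \end{pmatrix}, \qquad \ldots, \qquad \mathbf{e}_n = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{pmatrix}, \tag{B.5} $$

so that $\mathbf{e}_i$ is the vector with 1 in the $i$th slot and 0's elsewhere. However, there are many other bases of $\mathbb{R}^n$; indeed, any $n$ linearly independent vectors $\mathbf{v}_1, \ldots, \mathbf{v}_n \in \mathbb{R}^n$ form a basis.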

Lemma B.8. The elements $\mathbf{v}_1, \ldots, \mathbf{v}_n$ form a basis of $V$ if and only if every $\mathbf{v} \in V$ can be written uniquely as a linear combination of the basis elements:

$$ \mathbf{v} = c_1 \mathbf{v}_1 + \cdots + c_n \mathbf{v}_n = \sum_{i=1}^{n} c_i \mathbf{v}_i. \tag{B.6} $$

The coefficients $(c_1, \ldots, c_n)$ are called the coordinates of the vector $\mathbf{v}$ with respect to the given basis.

Theorem B.9. Suppose the vector space $V$ has a basis $\mathbf{v}_1, \ldots, \mathbf{v}_n$. Then every other basis of $V$ has the same number of elements in it. This number is called the dimension of $V$, and written $\dim V = n$.

On the other hand, if the vector space contains infinitely many linearly independent elements, then it does not have a basis in the sense of Definition B.7, and is thus infinite-dimensional. Apart from $\mathcal{P}^{(n)}$, all of the function spaces and subspaces listed above are infinite-dimensional vector spaces. An example of a finite-dimensional function space is the space $\mathcal{P}^{(n)} \subset \mathcal{F}(\mathbb{R})$ consisting of all polynomials $p(x) = a_0 + a_1 x + \cdots + a_n x^n$ of degree $\leq n$. The monomials $1, x, x^2, \ldots, x^n$ form a basis, and hence $\mathcal{P}^{(n)}$ has dimension $n + 1$. (On the other hand, the vector space containing all polynomials is infinite-dimensional.)

The dot product on Euclidean space $\mathbb{R}^n$ plays an essential role in geometry, analysis, and mechanics. Its basic properties inspire the general definition of an inner product on a vector space.

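Definition B.10. An inner product on the real vector space $V$ is a pairing that takes two elements $\mathbf{v}, \mathbf{w} \in V$ and produces a real number $\langle \mathbf{v}, \mathbf{w} \rangle \in \mathbb{R}$, subject to the following three axioms for all $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$, and scalars $c, d \in \mathbb{R}$.

(i) Bilinearity:

$$ \langle c\,\mathbf{u} + d\,\mathbf{v}, \mathbf{w} \rangle = c\,\langle \mathbf{u}, \mathbf{w} \rangle + d\,\langle \mathbf{v}, \mathbf{w} \rangle, \qquad \langle \mathbf{u}, c\,\mathbf{v} + d\,\mathbf{w} \rangle = c\,\langle \mathbf{u}, \mathbf{v} \rangle + d\,\langle \mathbf{u}, \mathbf{w} \rangle. \tag{B.7} $$

(ii) Symmetry:

$$ \langle \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{w}, \mathbf{v} \rangle. \tag{B.8} $$

(iii) Positivity:

$$ \langle \mathbf{v}, \mathbf{v} \rangle > 0 \quad \text{whenever} \quad \mathbf{v} \neq \mathbf{0}, \qquad \text{while} \qquad \langle \mathbf{0}, \mathbf{0} \rangle = 0. \tag{B.9} $$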


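Given an inner product, the associated norm of an element $\mathbf{v} \in V$ is defined as the positive square root of its inner product with itself:

$$ \| \mathbf{v} \| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}. \tag{B.10} $$

Bilinearity of the inner product implies that $\| c\,\mathbf{v} \| = |c|\,\| \mathbf{v} \|$ for any scalar $c$. The positivity axiom implies that $\| \mathbf{v} \| \geq 0$ is real and nonnegative, and equals 0 if and only if $\mathbf{v} = \mathbf{0}$ is the zero element. A vector space norm induces a notion of distance between elements $\mathbf{v}, \mathbf{w} \in V$, with $\operatorname{dist}(\mathbf{v}, \mathbf{w}) = \| \mathbf{v} - \mathbf{w} \|$. In particular, $\operatorname{dist}(\mathbf{v}, \mathbf{w}) = 0$ if and only if $\mathbf{v} = \mathbf{w}$.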

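Example B.11. The most familiar example of an inner product is the dot product†

$$ \langle \mathbf{v}, \mathbf{w} \rangle = \mathbf{v} \cdot \mathbf{w} = \mathbf{v}^T \mathbf{w} = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n \tag{B.11} $$

on the Euclidean space $\mathbb{R}^n$. The associated Euclidean norm

$$ \| \mathbf{v} \| = \sqrt{\mathbf{v} \cdot \mathbf{v}} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} \tag{B.12} $$

conforms to our usual notion of distance between points in Euclidean space.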


To find the most general inner product on $\mathbb{R}^n$, we need to introduce the important class of positive definite matrices.

Definition B.12. An $n \times n$ matrix $C$ is called positive definite if it satisfies the positivity condition

$$ \mathbf{v}^T C\,\mathbf{v} > 0 \qquad \text{for all} \qquad \mathbf{0} \neq \mathbf{v} \in \mathbb{R}^n. \tag{B.13} $$

We will sometimes write $C > 0$ to mean that $C$ is a positive definite matrix.

† The elements $\mathbf{v} \in \mathbb{R}^n$ are to be regarded as column vectors, while the transpose, written $\mathbf{v}^T$, is the corresponding row vector.

Warning: The condition $C > 0$ does not mean that all the entries of $C$ are positive. For example, $\begin{pmatrix} 3 & -1 \\ -1 & \ldots \end{pmatrix}$ is positive definite, whereas $\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$ is not.


Many  authors,  including  [89],  require  that  a  positive  definite  matrix  also  be  symmetric. 
We  will  not  impose  this  condition  here  a  priori.  However,  most  of  the  positive  definite 
matrices  we  will  encounter  in  applications  will  be  symmetric  (or,  more  generally,  self- 
adjoint  —  as  in  Example  9.15).  For  a  symmetric  matrix,  the  most  useful  test  for  positive 
definiteness  is  to  perform  Gaussian  Elimination  on  the  matrix  C,  which  is  positive  definite 
if  and  only  if  no  row  interchanges  are  needed,  and  all  the  pivots  are  positive,  [89]. 
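
The pivot test just described can be sketched in a few lines of Python. This is a minimal illustration for small symmetric matrices stored as lists of rows, assuming floating-point arithmetic is accurate enough for the matrix at hand; the function name is_positive_definite and the sample matrices are illustrative choices, not notation from the text or from [89].

```python
def is_positive_definite(matrix):
    """Pivot test for a symmetric matrix given as a list of rows.

    Runs Gaussian Elimination without row interchanges and reports
    whether every pivot encountered is strictly positive.
    """
    a = [[float(entry) for entry in row] for row in matrix]  # work on a copy
    n = len(a)
    for k in range(n):
        pivot = a[k][k]
        if pivot <= 0.0:
            # A zero pivot would force a row interchange; a negative pivot
            # fails the positivity requirement. Either way the test fails.
            return False
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k, n):
                a[i][j] -= factor * a[k][j]
    return True


# A matrix with negative entries that passes (pivots 2 and 1.5).
print(is_positive_definite([[2, -1], [-1, 2]]))
# A matrix with all positive entries that fails (second pivot is -3).
print(is_positive_definite([[1, 2], [2, 1]]))
```

The two sample matrices make the same point as the warning above: the signs of the entries say nothing, in either direction, about positive definiteness.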


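Proposition B.13. Every inner product on $\mathbb{R}^n$ is given by

$$ \langle \mathbf{v}, \mathbf{w} \rangle = \mathbf{v}^T C\,\mathbf{w} \qquad \text{for} \qquad \mathbf{v}, \mathbf{w} \in \mathbb{R}^n, \tag{B.14} $$

where $C > 0$ is a symmetric positive definite matrix.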


The  next  example  is  of  particular  significance  in  Fourier  analysis  and  partial  differential 
equations. 


Example B.14. Let $[a, b] \subset \mathbb{R}$ be a bounded closed interval. The integral

$$ \langle f, g \rangle = \int_a^b f(x)\,g(x)\,dx \tag{B.15} $$

defines an inner product on the space $C^0[a, b]$ of continuous functions. The associated norm

$$ \| f \| = \sqrt{\int_a^b f(x)^2\,dx} \tag{B.16} $$

is known as the $L^2$ norm of the function $f$ over the interval $[a, b]$. The positivity of the norm, $\| f \| > 0$ for $f \not\equiv 0$, follows from the fact that the only continuous nonnegative function $g(x) \geq 0$ that satisfies $\int_a^b g(x)\,dx = 0$ is the zero function $g(x) \equiv 0$. Extending this construction to spaces containing discontinuous functions is trickier, since there are discontinuous functions that are not identically zero, but nevertheless have zero norm. An example is a function that is zero except at a single point. Further discussion can be found in Section 3.5.
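
For readers who want to experiment, the $L^2$ inner product (B.15) and norm (B.16) can be approximated numerically. The sketch below uses the midpoint rule, which is only one of many possible quadrature schemes; the function names, the default number of subintervals, and the sample functions are assumptions made for this illustration.

```python
import math


def l2_inner_product(f, g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f(x) * g(x) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h   # midpoint of the i-th subinterval
        total += f(x) * g(x)
    return h * total


def l2_norm(f, a, b, n=10_000):
    """Approximate L2 norm: square root of the inner product of f with itself."""
    return math.sqrt(l2_inner_product(f, f, a, b, n))


def identity(x):
    return x


def one(x):
    return 1.0


# On [0, 1]: the exact values are <x, 1> = 1/2 and ||x|| = 1/sqrt(3).
print(l2_inner_product(identity, one, 0.0, 1.0))
print(l2_norm(identity, 0.0, 1.0))
```

Because the approximation only samples finitely many points, it cannot detect a function that differs from zero at a single point, which mirrors the difficulty with discontinuous functions noted above.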


The  two  most  important  inequalities  in  mathematical  analysis  apply  to  any  inner 
product  space. 

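Theorem B.15. Every inner product satisfies the Cauchy-Schwarz and triangle inequalities

$$ | \langle \mathbf{v}, \mathbf{w} \rangle | \leq \| \mathbf{v} \|\,\| \mathbf{w} \|, \qquad \| \mathbf{v} + \mathbf{w} \| \leq \| \mathbf{v} \| + \| \mathbf{w} \| \qquad \text{for all} \quad \mathbf{v}, \mathbf{w} \in V. \tag{B.17} $$

Equality holds in the Cauchy-Schwarz inequality if and only if $\mathbf{v}$ and $\mathbf{w}$ are parallel, i.e., scalar multiples of each other.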

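Proof: We begin with the Cauchy-Schwarz inequality: $| \langle \mathbf{v}, \mathbf{w} \rangle | \leq \| \mathbf{v} \|\,\| \mathbf{w} \|$. The case $\mathbf{w} = \mathbf{0}$ is trivial, and so we assume $\mathbf{w} \neq \mathbf{0}$. Let $t \in \mathbb{R}$ be an arbitrary scalar. Using the three inner product axioms, we have

$$ 0 \leq \| \mathbf{v} + t\,\mathbf{w} \|^2 = \langle \mathbf{v} + t\,\mathbf{w}, \mathbf{v} + t\,\mathbf{w} \rangle = \| \mathbf{v} \|^2 + 2\,t\,\langle \mathbf{v}, \mathbf{w} \rangle + t^2\,\| \mathbf{w} \|^2, \tag{B.18} $$

with equality holding if and only if $\mathbf{v} = -t\,\mathbf{w}$, which requires $\mathbf{v}$ and $\mathbf{w}$ to be parallel vectors. We fix $\mathbf{v}$ and $\mathbf{w}$, and consider the right-hand side of (B.18) as a quadratic function of $t$. Its minimum value occurs when $t = -\| \mathbf{w} \|^{-2}\,\langle \mathbf{v}, \mathbf{w} \rangle$. Substituting this value into (B.18), we obtain

$$ 0 \leq \| \mathbf{v} \|^2 - 2\,\frac{\langle \mathbf{v}, \mathbf{w} \rangle^2}{\| \mathbf{w} \|^2} + \frac{\langle \mathbf{v}, \mathbf{w} \rangle^2}{\| \mathbf{w} \|^2} = \| \mathbf{v} \|^2 - \frac{\langle \mathbf{v}, \mathbf{w} \rangle^2}{\| \mathbf{w} \|^2}, $$

and hence $\langle \mathbf{v}, \mathbf{w} \rangle^2 \leq \| \mathbf{v} \|^2\,\| \mathbf{w} \|^2$, which, upon taking the square root, establishes the Cauchy-Schwarz inequality. Again, as noted above, equality holds if and only if $\mathbf{v}$ and $\mathbf{w}$ are parallel.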

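To establish the triangle inequality, we compute

$$ \| \mathbf{v} + \mathbf{w} \|^2 = \langle \mathbf{v} + \mathbf{w}, \mathbf{v} + \mathbf{w} \rangle = \| \mathbf{v} \|^2 + 2\,\langle \mathbf{v}, \mathbf{w} \rangle + \| \mathbf{w} \|^2 \leq \| \mathbf{v} \|^2 + 2\,\| \mathbf{v} \|\,\| \mathbf{w} \| + \| \mathbf{w} \|^2 = \bigl( \| \mathbf{v} \| + \| \mathbf{w} \| \bigr)^2, $$

where the middle inequality follows from the Cauchy-Schwarz inequality (which clearly also holds if the absolute value is removed). Taking square roots of both sides completes the proof. Q.E.D.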
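
The key step of the proof, minimizing the quadratic in $t$, can be checked on a concrete pair of vectors. The following sketch uses the ordinary dot product on $\mathbb{R}^3$; the two vectors and the helper names dot and norm are arbitrary choices made for illustration.

```python
import math


def dot(v, w):
    """Dot product of two real vectors given as lists."""
    return sum(vi * wi for vi, wi in zip(v, w))


def norm(v):
    """Euclidean norm induced by the dot product."""
    return math.sqrt(dot(v, v))


v = [1.0, 2.0, -1.0]
w = [3.0, 0.0, 4.0]

# Minimizer of q(t) = ||v||^2 + 2 t <v, w> + t^2 ||w||^2.
t_min = -dot(v, w) / dot(w, w)
q_min = dot(v, v) + 2.0 * t_min * dot(v, w) + t_min**2 * dot(w, w)

# The minimum value should equal ||v||^2 - <v, w>^2 / ||w||^2 and be >= 0.
expected_min = dot(v, v) - dot(v, w) ** 2 / dot(w, w)

cauchy_schwarz_holds = abs(dot(v, w)) <= norm(v) * norm(w)
v_plus_w = [vi + wi for vi, wi in zip(v, w)]
triangle_holds = norm(v_plus_w) <= norm(v) + norm(w)

print(t_min, q_min, expected_min)
print(cauchy_schwarz_holds, triangle_holds)
```

Nonnegativity of the minimum value is exactly the Cauchy-Schwarz inequality in squared form, so a single evaluation of the quadratic at its minimizer carries the whole argument.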


We will also have occasion to use inner products on complex vector spaces. To ensure that the associated norm remains positive, the real definition must be modified. The complex conjugate of a complex scalar $c = a + i\,b$, with $a, b \in \mathbb{R}$, will be indicated by an overbar: $\bar c = a - i\,b$. When dealing with a complex inner product space, one must pay careful attention to complex conjugation.


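Definition B.16. An inner product on the complex vector space $V$ is a pairing that takes two vectors $\mathbf{v}, \mathbf{w} \in V$ and produces a complex number $\langle \mathbf{v}, \mathbf{w} \rangle \in \mathbb{C}$, subject to the following requirements, for $\mathbf{u}, \mathbf{v}, \mathbf{w} \in V$, and $c, d \in \mathbb{C}$:

(i) Sesquilinearity:

$$ \langle c\,\mathbf{u} + d\,\mathbf{v}, \mathbf{w} \rangle = c\,\langle \mathbf{u}, \mathbf{w} \rangle + d\,\langle \mathbf{v}, \mathbf{w} \rangle, \qquad \langle \mathbf{u}, c\,\mathbf{v} + d\,\mathbf{w} \rangle = \bar c\,\langle \mathbf{u}, \mathbf{v} \rangle + \bar d\,\langle \mathbf{u}, \mathbf{w} \rangle. \tag{B.19} $$

(ii) Conjugate Symmetry:

$$ \langle \mathbf{v}, \mathbf{w} \rangle = \overline{\langle \mathbf{w}, \mathbf{v} \rangle}. \tag{B.20} $$

(iii) Positivity:

$$ \| \mathbf{v} \|^2 = \langle \mathbf{v}, \mathbf{v} \rangle \geq 0, \qquad \text{and} \quad \langle \mathbf{v}, \mathbf{v} \rangle = 0 \quad \text{if and only if} \quad \mathbf{v} = \mathbf{0}. \tag{B.21} $$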


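Example B.17. The simplest example is the Hermitian dot product

$$ \mathbf{z} \cdot \mathbf{w} = \mathbf{z}^T\,\overline{\mathbf{w}} = z_1 \bar w_1 + z_2 \bar w_2 + \cdots + z_n \bar w_n, \qquad \text{for} \qquad \mathbf{z} = \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix}, \quad \mathbf{w} = \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}, \tag{B.22} $$

between complex vectors $\mathbf{z}, \mathbf{w} \in \mathbb{C}^n$.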


Example B.18. Let $C^0[-\pi, \pi]$ denote the complex vector space consisting of all complex-valued continuous functions $f(x) = u(x) + i\,v(x)$ depending on the real variable $-\pi \leq x \leq \pi$. The $L^2$ Hermitian inner product on $C^0[-\pi, \pi]$ is defined as

$$ \langle f, g \rangle = \int_{-\pi}^{\pi} f(x)\,\overline{g(x)}\,dx, \tag{B.23} $$

i.e., the integral of $f$ times the complex conjugate of $g$, with corresponding norm

$$ \| f \| = \sqrt{\int_{-\pi}^{\pi} | f(x) |^2\,dx} = \sqrt{\int_{-\pi}^{\pi} \bigl[\,u(x)^2 + v(x)^2\,\bigr]\,dx}. \tag{B.24} $$

Inner  products  on  complex  vector  spaces  also  satisfy  the  Cauchy-Schwarz  and  triangle 
inequalities  (B.17).  The  proof  is  left  as  an  exercise  for  the  reader;  see  [89;  Exercise  3.6.46]. 

B.4  Orthogonality 


Definition B.19. Two elements $\mathbf{v}, \mathbf{w} \in V$ of an inner product space $V$ are called orthogonal if their inner product vanishes: $\langle \mathbf{v}, \mathbf{w} \rangle = 0$.

For ordinary Euclidean space equipped with the dot product, two vectors are orthogonal if and only if they are perpendicular, i.e., meet at a right angle.

Definition B.20. A basis $\mathbf{u}_1, \ldots, \mathbf{u}_n$ of an inner product space $V$ is called orthogonal if $\langle \mathbf{u}_i, \mathbf{u}_j \rangle = 0$ for all $i \neq j$. The basis is called orthonormal if, in addition, each vector has unit length: $\| \mathbf{u}_i \| = 1$, for all $i = 1, \ldots, n$.

For example, the standard basis vectors (B.5) form an orthonormal basis of $\mathbb{R}^n$ with respect to the dot product, but they are not orthonormal for any other inner product thereon.


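Theorem B.21. If $\mathbf{v}_1, \ldots, \mathbf{v}_n$ form an orthogonal basis, then the corresponding coordinates of a vector

$$ \mathbf{v} = a_1 \mathbf{v}_1 + \cdots + a_n \mathbf{v}_n \qquad \text{are given by} \qquad a_i = \frac{\langle \mathbf{v}, \mathbf{v}_i \rangle}{\| \mathbf{v}_i \|^2}. \tag{B.25} $$

Moreover, the vector's norm can be computed using the formula

$$ \| \mathbf{v} \|^2 = \sum_{i=1}^{n} a_i^2\,\| \mathbf{v}_i \|^2 = \sum_{i=1}^{n} \left( \frac{\langle \mathbf{v}, \mathbf{v}_i \rangle}{\| \mathbf{v}_i \|} \right)^2. \tag{B.26} $$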

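Proof: We compute the inner product of (B.25) with one of the basis vectors. By orthogonality,

$$ \langle \mathbf{v}, \mathbf{v}_i \rangle = \Bigl\langle \sum_{j=1}^{n} a_j \mathbf{v}_j, \mathbf{v}_i \Bigr\rangle = \sum_{j=1}^{n} a_j\,\langle \mathbf{v}_j, \mathbf{v}_i \rangle = a_i\,\| \mathbf{v}_i \|^2. $$

To prove formula (B.26), we similarly expand

$$ \| \mathbf{v} \|^2 = \langle \mathbf{v}, \mathbf{v} \rangle = \sum_{i,j=1}^{n} a_i a_j\,\langle \mathbf{v}_i, \mathbf{v}_j \rangle = \sum_{i=1}^{n} a_i^2\,\| \mathbf{v}_i \|^2. $$

Q.E.D.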


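In the case of an orthonormal basis, the formulas (B.25-26) simplify to

$$ \mathbf{v} = c_1 \mathbf{u}_1 + \cdots + c_n \mathbf{u}_n, \quad \text{where} \quad c_i = \langle \mathbf{v}, \mathbf{u}_i \rangle, \qquad \| \mathbf{v} \|^2 = c_1^2 + \cdots + c_n^2. \tag{B.27} $$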
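
The coordinate formula (B.25) and the norm formula (B.26) are easy to try out on a small example. The sketch below uses an orthogonal (but not orthonormal) basis of $\mathbb{R}^3$ under the dot product; the basis, the vector, and the helper name dot are illustrative assumptions.

```python
def dot(v, w):
    """Dot product of two real vectors given as lists."""
    return sum(vi * wi for vi, wi in zip(v, w))


# An orthogonal basis of R^3 (pairwise dot products vanish; lengths differ).
basis = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 2.0]]
v = [3.0, 1.0, 4.0]

# (B.25): a_i = <v, v_i> / ||v_i||^2, with no linear system to solve.
coords = [dot(v, b) / dot(b, b) for b in basis]

# Rebuild v from its coordinates to confirm the expansion.
rebuilt = [sum(a * b[i] for a, b in zip(coords, basis)) for i in range(3)]

# (B.26): ||v||^2 = sum of a_i^2 ||v_i||^2.
norm_squared = sum(a * a * dot(b, b) for a, b in zip(coords, basis))

print(coords)
print(rebuilt)
print(norm_squared, dot(v, v))
```

For a general, non-orthogonal basis the coordinates would require solving an $n \times n$ linear system; orthogonality replaces that with $n$ independent inner products.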

Example B.22. A particularly important orthogonal basis is provided by the following vectors lying in $\mathbb{C}^n$:

$$ \boldsymbol{\omega}_k = \bigl( 1, \zeta^k, \zeta^{2k}, \zeta^{3k}, \ldots, \zeta^{(n-1)k} \bigr)^T = \bigl( 1, e^{2k\pi i/n}, e^{4k\pi i/n}, \ldots, e^{2(n-1)k\pi i/n} \bigr)^T, \qquad k = 0, \ldots, n - 1, \tag{B.28} $$

where

$$ \zeta = e^{2\pi i/n}, \qquad \text{while} \qquad \bar\zeta = e^{-2\pi i/n} = \zeta^{-1}. \tag{B.29} $$

Orthogonality relies on the fact that its powers, $\zeta^k = e^{2k\pi i/n}$, $k = 0, \ldots, n - 1$, are the complex roots of the elementary polynomial

$$ z^n - 1 = (z - 1)\,(1 + z + z^2 + \cdots + z^{n-1}). \tag{B.30} $$

Since, when $0 < k \leq n - 1$, the complex number $\zeta^k \neq 1$ is a root of the polynomial (B.30), it must also be a root of the second factor. This implies that

$$ 1 + \zeta^k + \zeta^{2k} + \cdots + \zeta^{(n-1)k} = \begin{cases} n, & k \equiv 0 \bmod n, \\ 0, & k \not\equiv 0 \bmod n, \end{cases} $$

where the former case $k \equiv 0 \bmod n$ follows by direct substitution of $\zeta^k = 1$. Thus, the Hermitian inner products of the vectors (B.28) equal

$$ \boldsymbol{\omega}_k \cdot \boldsymbol{\omega}_l = \sum_{j=0}^{n-1} \zeta^{jk}\,\overline{\zeta^{jl}} = \sum_{j=0}^{n-1} \zeta^{j(k-l)} = \begin{cases} n, & k = l, \\ 0, & k \neq l, \end{cases} \tag{B.31} $$

provided $0 \leq k, l \leq n - 1$, thereby establishing orthogonality. These vectors are the discrete analogues of the orthogonal complex exponential functions that are used to construct complex Fourier series. They are the basis of the discrete Fourier transform, [89; §5.7], and their orthogonality is the key to modern signal processing.
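
The orthogonality relations (B.31) can be confirmed numerically for any particular $n$. The following sketch builds the vectors (B.28) with Python's standard cmath module and tabulates their Hermitian dot products; the choice $n = 8$ and the function names omega and hermitian_dot are assumptions made for this illustration.

```python
import cmath


def omega(k, n):
    """Vector (B.28): entries zeta^(m k) for m = 0..n-1, zeta = exp(2 pi i / n)."""
    return [cmath.exp(2j * cmath.pi * m * k / n) for m in range(n)]


def hermitian_dot(z, w):
    """Hermitian dot product (B.22): the second factor is conjugated."""
    return sum(zm * wm.conjugate() for zm, wm in zip(z, w))


n = 8
gram = [[hermitian_dot(omega(k, n), omega(l, n)) for l in range(n)]
        for k in range(n)]

# Diagonal entries should be close to n; off-diagonal entries close to 0.
largest_off_diagonal = max(abs(gram[k][l])
                           for k in range(n) for l in range(n) if k != l)
print(abs(gram[0][0]), abs(gram[3][3]), largest_off_diagonal)
```

Since each vector has squared norm $n$ rather than 1, the basis is orthogonal but not orthonormal; dividing each vector by $\sqrt{n}$ would normalize it.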


B.5  Eigenvalues  and  Eigenvectors 

The  eigenvalues  and  eigenvectors  of  a  matrix  first  appear  when  solving  linear  systems 
of  ordinary  differential  equations.  But  their  essential  importance  extends  across  all  of 
mathematics  and  its  manifold  applications.  Extensions  of  the  eigenvalue  method  to  linear 
operators  on  function  spaces  are  critical  to  the  analysis  of  partial  differential  equations. 

Definition B.23. Let $A$ be an $n \times n$ matrix. A scalar $\lambda$ is called an eigenvalue of $A$ if there is a nonzero vector $\mathbf{v} \neq \mathbf{0}$, called an associated eigenvector, such that

$$ A\,\mathbf{v} = \lambda\,\mathbf{v}. \tag{B.32} $$

In particular, a matrix has $\lambda = 0$ as an eigenvalue if and only if it has a null eigenvector $\mathbf{v} \neq \mathbf{0}$, satisfying $A\,\mathbf{v} = \mathbf{0}$, and hence is a singular (non-invertible) matrix, with vanishing determinant: $\det A = 0$. An eigenvalue is called simple if it admits only one linearly independent eigenvector; more generally, the multiplicity of an eigenvalue is defined as the dimension of the eigenspace consisting of all solutions to the eigenequation (B.32), including $\mathbf{0}$. Thus, a simple eigenvalue has multiplicity 1.

Even  if  A  is  a  real  matrix,  we  must  allow  the  possibility  of  complex  eigenvectors. 
Matrices  with  a  “complete”  set  of  eigenvectors  are  the  most  common,  and  also  the  easiest 
to  deal  with. 


Definition B.24. An $n \times n$ real or complex matrix $A$ is called complete if there exists a basis of $\mathbb{C}^n$ consisting of its (complex) eigenvectors.

It  is  not  hard  to  show  that  eigenvectors  corresponding  to  different  eigenvalues  are 
necessarily  linearly  independent.  This  means  that  matrices  with  all  distinct  (and  hence 
simple)  eigenvalues  are  necessarily  complete: 

Proposition  B.25.  Any  n  x  n  matrix  with  n  distinct  eigenvalues  is  complete. 


Unfortunately, not all matrices with repeated eigenvalues are complete. For instance, $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is complete, since, for instance, $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$ form an eigenvector basis of $\mathbb{C}^2$, whereas $\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ is not, since it has only one independent eigenvector, namely $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$. Incomplete matrices are much more challenging to deal with, both theoretically and numerically. Fortunately, we can safely ignore the incomplete cases in this text.

The most common way for orthogonal bases to arise is as eigenvector bases of symmetric matrices. (Orthogonality is with respect to the standard dot product on $\mathbb{R}^n$.) The extension of this result to “self-adjoint” operators on function space forms the foundation of Fourier analysis and its generalizations.


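Theorem B.26. Let $A = A^T$ be a real symmetric $n \times n$ matrix. Then

(a) All the eigenvalues of $A$ are real.

(b) Eigenvectors corresponding to distinct eigenvalues are orthogonal.

(c) There is an orthonormal basis of $\mathbb{R}^n$ consisting of $n$ eigenvectors of $A$.


Let us demonstrate orthogonality, leaving the remaining steps in the proof to [89; Theorem 8.20]. If

$$ A\,\mathbf{v} = \lambda\,\mathbf{v}, \qquad A\,\mathbf{w} = \mu\,\mathbf{w}, $$

where $\lambda \neq \mu$ are distinct real eigenvalues, then, by symmetry of $A$,

$$ \lambda\,\mathbf{v} \cdot \mathbf{w} = (A\,\mathbf{v}) \cdot \mathbf{w} = (A\,\mathbf{v})^T \mathbf{w} = \mathbf{v}^T A\,\mathbf{w} = \mathbf{v} \cdot (A\,\mathbf{w}) = \mathbf{v} \cdot (\mu\,\mathbf{w}) = \mu\,\mathbf{v} \cdot \mathbf{w}, $$

and hence

$$ (\lambda - \mu)\,\mathbf{v} \cdot \mathbf{w} = 0. $$

Since $\lambda \neq \mu$, this implies that the eigenvectors $\mathbf{v}, \mathbf{w}$ are necessarily orthogonal.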
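
For a $2 \times 2$ symmetric matrix the eigenvalues and eigenvectors have closed forms, which makes the orthogonality statement easy to test. The sketch below is limited to that special case with a nonzero off-diagonal entry; the function name symmetric_2x2_eigenpairs and the sample matrix are assumptions made for illustration, and a general-purpose eigenvalue routine would be used in practice.

```python
import math


def symmetric_2x2_eigenpairs(a, b, d):
    """Eigenpairs of the symmetric matrix [[a, b], [b, d]], assuming b != 0."""
    mean = (a + d) / 2.0
    # The quantity under the square root is a sum of squares, so both
    # eigenvalues are real, in line with part (a) of the theorem.
    radius = math.hypot((a - d) / 2.0, b)
    pairs = []
    for lam in (mean + radius, mean - radius):
        # v = (b, lam - a) satisfies (A - lam I) v = 0 whenever b != 0.
        pairs.append((lam, (b, lam - a)))
    return pairs


(lam1, v1), (lam2, v2) = symmetric_2x2_eigenpairs(2.0, 1.0, 2.0)
eigenvector_dot = v1[0] * v2[0] + v1[1] * v2[1]

print(lam1, v1)
print(lam2, v2)
print(eigenvector_dot)
```

For this sample matrix the eigenvalues are 3 and 1, with eigenvectors along $(1, 1)$ and $(1, -1)$, whose dot product vanishes.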


B.6  Linear  Iteration 

For numerical applications, we will require some basic results on iteration of linear systems. Consider first a homogeneous linear iterative system of the form

$$ \mathbf{u}^{(k+1)} = A\,\mathbf{u}^{(k)}, \qquad \mathbf{u}^{(0)} = \mathbf{u}_0, \tag{B.33} $$

in which $A$ is an $n \times n$ matrix and $\mathbf{u}_0 \in \mathbb{R}^n$ or $\mathbb{C}^n$. The solution to such a system is evidently obtained by repeatedly multiplying the initial vector $\mathbf{u}_0$ by the matrix $A$, and so

$$ \mathbf{u}^{(k)} = A^k\,\mathbf{u}_0. \tag{B.34} $$

Definition B.27. A matrix $A$ is called convergent if every solution to the homogeneous linear iterative system (B.33) tends to zero in the limit: $\mathbf{u}^{(k)} \to \mathbf{0}$ as $k \to \infty$.

Equivalently, $A$ is convergent if and only if its powers converge to the zero matrix: $A^k \to O$ as $k \to \infty$.

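The solution formula (B.34), while elementary, is not particularly enlightening. An alternative approach is to recognize that if $\lambda_j$ is an eigenvalue of $A$ and $\mathbf{v}_j$ a corresponding eigenvector, then

$$ \mathbf{u}_j^{(k)} = \lambda_j^k\,\mathbf{v}_j \tag{B.35} $$

is a solution, since

$$ A\,\mathbf{u}_j^{(k)} = \lambda_j^k\,A\,\mathbf{v}_j = \lambda_j^{k+1}\,\mathbf{v}_j = \mathbf{u}_j^{(k+1)}. $$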

Moreover, linear combinations of such eigensolutions are also solutions. In particular, if $A$ is complete, then we can write down the general solution to (B.33) as a linear combination of the independent eigensolutions:

$$ \mathbf{u}^{(k)} = c_1 \lambda_1^k\,\mathbf{v}_1 + c_2 \lambda_2^k\,\mathbf{v}_2 + \cdots + c_n \lambda_n^k\,\mathbf{v}_n, \tag{B.36} $$

where $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is the eigenvector basis. The coefficients $c_1, \ldots, c_n$ are uniquely determined by the initial conditions,

$$ \mathbf{u}^{(0)} = c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_n \mathbf{v}_n = \mathbf{u}_0, $$

which relies on the fact that the eigenvectors $\mathbf{v}_1, \ldots, \mathbf{v}_n$ form a basis. Now, $A$ is convergent if and only if all solutions $\mathbf{u}^{(k)} \to \mathbf{0}$. The individual eigensolution (B.35) goes to zero if and only if its associated eigenvalue is strictly less than 1 in modulus: $| \lambda_j | < 1$. This proves the following result for complete matrices. The proof in the incomplete case relies on the Jordan canonical form, [89; Chapter 10].


Theorem B.28. The matrix $A$ is convergent if and only if all its eigenvalues satisfy
$|\lambda| < 1$.


Definition B.29. The spectral radius of a matrix $A$ is defined as the maximal modulus
of all of its real and complex eigenvalues: $\rho(A) = \max \{\, |\lambda_1|, \dots, |\lambda_k| \,\}$.

Corollary B.30. The matrix $A$ is convergent if and only if $\rho(A) < 1$.

Indeed,  the  spectral  radius  essentially  governs  the  rate  of  convergence  of  the  iterative 
system  —  the  closer  it  is  to  0,  the  faster  the  convergence  rate. 
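To make the spectral radius criterion concrete, here is a small sketch restricted to 2 x 2 matrices, where the eigenvalues follow from the quadratic formula applied to the characteristic polynomial. The function names and the two test matrices are assumptions made for illustration; a general-purpose computation would instead rely on a numerical eigenvalue routine.

```python
import cmath

def eigenvalues_2x2(A):
    """Eigenvalues of a 2x2 matrix, i.e. the roots of
    lambda^2 - tr(A) * lambda + det(A) = 0 (possibly complex)."""
    (a, b), (c, d) = A
    tr = a + d
    det = a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def spectral_radius_2x2(A):
    """Largest modulus among the eigenvalues."""
    return max(abs(lam) for lam in eigenvalues_2x2(A))

def is_convergent_2x2(A):
    """Criterion of Corollary B.30: convergent iff rho(A) < 1."""
    return spectral_radius_2x2(A) < 1

# Illustrative matrices (not from the text).
A_conv = [[0.5, 0.4], [-0.4, 0.5]]   # eigenvalues 0.5 +/- 0.4i
A_div = [[1.0, 1.0], [0.0, 1.0]]     # repeated eigenvalue 1

print(spectral_radius_2x2(A_conv), is_convergent_2x2(A_conv))
print(spectral_radius_2x2(A_div), is_convergent_2x2(A_div))
```

The first matrix has complex eigenvalues of modulus less than 1 and so is convergent, while the second has spectral radius exactly 1 and fails the strict inequality.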

Next,  consider  the  inhomogeneous  linear  iterative  system 


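$$\mathbf{v}^{(k+1)} = A\,\mathbf{v}^{(k)} + \mathbf{b}, \qquad \mathbf{v}^{(0)} = \mathbf{v}_0, \tag{B.37}$$

where $\mathbf{b}$ is a fixed vector. A fixed point is a vector $\mathbf{v}^\star$ that satisfies

$$\mathbf{v}^\star = A\,\mathbf{v}^\star + \mathbf{b}, \quad \text{or, equivalently,} \quad (I - A)\,\mathbf{v}^\star = \mathbf{b}, \tag{B.38}$$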

where $I$ is the identity matrix of the same size as $A$. Thus, if 1 is not an eigenvalue of
$A$ (as is automatically the case when $A$ is convergent, since then every eigenvalue has
modulus less than 1), then $I - A$ is nonsingular, and so the iterative system has a unique
fixed point.

Theorem B.31. Assume that 1 is not an eigenvalue of $A$. Then all solutions to
(B.37) converge to the fixed point, $\mathbf{v}^{(k)} \to \mathbf{v}^\star$ as $k \to \infty$, if and only if $A$ is a convergent
matrix.

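Proof: Let $\mathbf{u}^{(k)} = \mathbf{v}^{(k)} - \mathbf{v}^\star$, so that $\mathbf{v}^{(k)} \to \mathbf{v}^\star$ if and only if $\mathbf{u}^{(k)} \to \mathbf{0}$. Now,

$$\mathbf{u}^{(k+1)} = \mathbf{v}^{(k+1)} - \mathbf{v}^\star = (A\,\mathbf{v}^{(k)} + \mathbf{b}) - (A\,\mathbf{v}^\star + \mathbf{b}) = A\,(\mathbf{v}^{(k)} - \mathbf{v}^\star) = A\,\mathbf{u}^{(k)},$$

and hence $\mathbf{u}^{(k)}$ solves the homogeneous version (B.33). Thus, the result is an immediate
consequence of Definition B.27. Q.E.D.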
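The convergence to the fixed point can be observed directly. The sketch below, again in plain Python and limited to the 2 x 2 case, solves $(I - A)\mathbf{v}^\star = \mathbf{b}$ by Cramer's rule and compares the result with a long run of the iteration. The matrix, the vector `b`, the starting vector, and all function names are illustrative assumptions.

```python
def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def iterate_affine(A, b, v0, steps):
    """Run v -> A v + b for the given number of steps, starting from v0."""
    v = list(v0)
    for _ in range(steps):
        Av = mat_vec(A, v)
        v = [x + y for x, y in zip(Av, b)]
    return v

def fixed_point_2x2(A, b):
    """Solve (I - A) v = b for a 2x2 matrix A using Cramer's rule."""
    m11, m12 = 1 - A[0][0], -A[0][1]
    m21, m22 = -A[1][0], 1 - A[1][1]
    det = m11 * m22 - m12 * m21
    if det == 0:
        raise ValueError("1 is an eigenvalue of A, so I - A is singular")
    return [(b[0] * m22 - m12 * b[1]) / det,
            (m11 * b[1] - m21 * b[0]) / det]

# Illustrative data: A has eigenvalues 0.5 +/- 0.4i, so it is convergent.
A = [[0.5, 0.4], [-0.4, 0.5]]
b = [1.0, 2.0]

v_star = fixed_point_2x2(A, b)
v_100 = iterate_affine(A, b, [0.0, 0.0], 100)
print(v_star, v_100)
```

Because the error $\mathbf{v}^{(k)} - \mathbf{v}^\star$ satisfies the homogeneous system, it shrinks roughly like the k-th power of the spectral radius, so after 100 steps the iterate agrees with the fixed point to within rounding error.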


B.7  Linear  Functions  and  Systems 

The  most  basic  structural  features  of  linear  differential  equations,  both  ordinary  and 
partial,  linear  boundary  value  problems,  etc.,  are  founded  on  the  concept  of  a  linear  function 
between  vector  spaces. 

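Definition B.32. Let $U$ and $V$ be real vector spaces. A function $L: U \to V$ is called
linear if it obeys two basic rules:

$$L[\mathbf{u} + \mathbf{v}] = L[\mathbf{u}] + L[\mathbf{v}], \qquad L[c\,\mathbf{u}] = c\,L[\mathbf{u}], \tag{B.39}$$

for all $\mathbf{u}, \mathbf{v} \in U$ and all scalars $c$.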


We will refer to $U$ as the domain space of the function $L$, and $V$ as the target space.
The latter is to emphasize the fact that the range of $L$, namely

$$\operatorname{rng} L = \{\, \mathbf{v} \in V \mid \mathbf{v} = L[\mathbf{u}] \ \text{for some} \ \mathbf{u} \in U \,\}, \tag{B.40}$$

may very well be a proper subspace of the target space $V$.


Theorem B.33. Every linear function $L: \mathbb{R}^n \to \mathbb{R}^m$ is given by matrix multiplication,
$L[\mathbf{v}] = A\mathbf{v}$, where $A$ is an $m \times n$ matrix.

Proving that matrix multiplication satisfies the linearity conditions (B.39) is easy. The
converse is established by seeing what the linear function does to the basis vectors of $\mathbb{R}^n$;
see [89; Theorem 7.5].
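The idea behind the converse can be sketched in a few lines: applying the function to each standard basis vector produces the columns of the matrix. In the example below, the map `L` from $\mathbb{R}^3$ to $\mathbb{R}^2$ and the helper names `matrix_of` and `mat_vec` are hypothetical, chosen only to illustrate the construction.

```python
def matrix_of(L, n):
    """Return the m x n matrix (list of rows) whose j-th column is L(e_j),
    where e_j is the j-th standard basis vector of R^n."""
    columns = []
    for j in range(n):
        e_j = [1.0 if i == j else 0.0 for i in range(n)]
        columns.append(L(e_j))
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]

def mat_vec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def L(v):
    """A hypothetical linear map from R^3 to R^2."""
    x, y, z = v
    return [2 * x - y, y + 3 * z]

A = matrix_of(L, 3)
v = [1.0, 2.0, 3.0]
print(A)
print(mat_vec(A, v), L(v))
```

By linearity, agreement on the basis vectors forces `mat_vec(A, v)` and `L(v)` to agree on every vector, which is what the final line checks for one sample input.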


Corollary B.34. Every linear function $L: \mathbb{R}^n \to \mathbb{R}$ is given by taking the dot product
with a fixed vector $\mathbf{a} \in \mathbb{R}^n$:

$$L[\mathbf{v}] = \mathbf{a} \cdot \mathbf{v}. \tag{B.41}$$


When $U$ is a function space, a linear function is also referred to as a linear operator
in order to avoid confusion with the elements of $U$. If the target space $V = \mathbb{R}$, then the
term linear functional is also often used for $L: U \to \mathbb{R}$.

Here  are  some  representative  examples  that  arise  in  applications. 

Example B.35. (a) Evaluation of a function at a point, namely $L[f] = f(x_0)$,
defines a linear operator $L: C^0[a,b] \to \mathbb{R}$.

(b) Integration,

$$I[f] = \int_a^b f(x)\,dx, \tag{B.42}$$

also defines a linear functional $I: C^0[a,b] \to \mathbb{R}$.




(c) The operation $M_a[f(x)] = a(x)\,f(x)$ of multiplication by a continuous function
$a$ defines a linear operator $M_a: C^0[a,b] \to C^0[a,b]$.

(d) Differentiation of functions, $D[f] = f'$, serves to define a linear operator
$D: C^1[a,b] \to C^0[a,b]$.

(e) A general linear ordinary differential operator of order $n$,

$$L = a_n(x)\,D^n + a_{n-1}(x)\,D^{n-1} + \cdots + a_1(x)\,D + a_0(x), \tag{B.43}$$

is obtained by summing such operators. If the coefficient functions $a_0(x), \dots, a_n(x)$ are
continuous, then

$$L[u] = a_n(x)\,\frac{d^n u}{dx^n} + a_{n-1}(x)\,\frac{d^{n-1} u}{dx^{n-1}} + \cdots + a_1(x)\,\frac{du}{dx} + a_0(x)\,u \tag{B.44}$$

defines a linear operator from $C^n[a,b]$ to $C^0[a,b]$.


Linear  partial  differential  equations  are  based  on  linear  partial  differential  operators, 
which  are  discussed  in  Chapter  1.  They  are  particular  examples  of  the  general  concept  of 
a  linear  system. 


Definition B.36. A linear system is an equation of the form

$$L[\mathbf{u}] = \mathbf{f}, \tag{B.45}$$

in which $L: U \to V$ is a linear function, $\mathbf{f} \in V$, while the desired solution $\mathbf{u} \in U$. The
system is homogeneous if $\mathbf{f} = \mathbf{0}$; otherwise, it is called inhomogeneous.


Note that, by the definition (B.40) of the range of $L$, the linear system (B.45) will
have a solution if and only if $\mathbf{f} \in \operatorname{rng} L$. In particular, a homogeneous linear system always
has a solution, namely $\mathbf{u} = \mathbf{0}$. However, it may possibly admit other, nonzero, solutions.


Theorem B.37. If $\mathbf{z}_1, \dots, \mathbf{z}_k$ are all solutions to the same homogeneous linear system

$$L[\mathbf{z}] = \mathbf{0}, \tag{B.46}$$

then any linear combination $c_1 \mathbf{z}_1 + \cdots + c_k \mathbf{z}_k$, for any scalars $c_1, \dots, c_k$, is also a solution.

In other words, the set of solutions to a homogeneous linear system (B.46) forms a
subspace of the domain space $U$, known as the kernel of the linear function $L$:

$$\ker L = \{\, \mathbf{z} \in U \mid L[\mathbf{z}] = \mathbf{0} \,\}. \tag{B.47}$$


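Theorem B.38. If the inhomogeneous linear system $L[\mathbf{u}] = \mathbf{f}$ has a particular
solution $\mathbf{u}^\star$, which requires $\mathbf{f} \in \operatorname{rng} L$, then the general solution is $\mathbf{u} = \mathbf{u}^\star + \mathbf{z}$, where
$\mathbf{z} \in \ker L$ is any solution to the corresponding homogeneous system $L[\mathbf{z}] = \mathbf{0}$.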
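As a brief illustration of Theorem B.38 (an example chosen here, not taken from the text), consider the second-order differential operator $L = D^2 + 1$, which is of the form (B.43), together with the right-hand side $f(x) = x$. A particular solution can be found by inspection, and the kernel is spanned by the two familiar trigonometric solutions of $u'' + u = 0$:

```latex
% Illustrative example (not from the source): L[u] = u'' + u with f(x) = x.
\begin{aligned}
L[u] &= u'' + u = x, \\
u^\star(x) &= x
  \quad \text{is a particular solution, since } (x)'' + x = 0 + x = x, \\
\ker L &= \{\, c_1 \cos x + c_2 \sin x \;:\; c_1, c_2 \in \mathbb{R} \,\}, \\
u(x) &= u^\star(x) + z(x) = x + c_1 \cos x + c_2 \sin x
  \quad \text{is the general solution.}
\end{aligned}
```

The two arbitrary constants reflect the fact that the kernel of this operator is a two-dimensional subspace of the domain space.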

The  Superposition  Principle  for  inhomogeneous  linear  systems  allows  us  to  combine 
solutions  corresponding  to  different  right-hand  sides. 

Theorem B.39. Suppose that for each $i = 1, \dots, k$, we know a particular solution $\mathbf{u}_i^\star$
to the inhomogeneous linear system $L[\mathbf{u}] = \mathbf{f}_i$ for some $\mathbf{f}_i \in \operatorname{rng} L$. Then, given scalars
$c_1, \dots, c_k$, a particular solution to the combined inhomogeneous system

$$L[\mathbf{u}] = c_1 \mathbf{f}_1 + \cdots + c_k \mathbf{f}_k \tag{B.48}$$

is the corresponding linear combination

$$\mathbf{u}^\star = c_1 \mathbf{u}_1^\star + \cdots + c_k \mathbf{u}_k^\star \tag{B.49}$$

of particular solutions. The general solution to the inhomogeneous system (B.48) is

$$\mathbf{u} = \mathbf{u}^\star + \mathbf{z} = c_1 \mathbf{u}_1^\star + \cdots + c_k \mathbf{u}_k^\star + \mathbf{z}, \tag{B.50}$$

where $\mathbf{z} \in \ker L$ is an arbitrary solution to the associated homogeneous system $L[\mathbf{z}] = \mathbf{0}$.
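The Superposition Principle is easy to verify in the finite-dimensional case, where the linear function is multiplication by a matrix. The sketch below assumes a hypothetical 2 x 3 matrix `A`, two particular solutions `u1` and `u2`, a kernel vector `z`, and arbitrary scalars; none of these values come from the text.

```python
def mat_vec(A, u):
    """Apply the linear function L[u] = A u (A stored as a list of rows)."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def combine(c1, x, c2, y):
    """Form the linear combination c1 * x + c2 * y of two vectors."""
    return [c1 * a + c2 * b for a, b in zip(x, y)]

# Illustrative linear function L: R^3 -> R^2.
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 1.0]]

u1 = [1.0, 0.0, 0.0]           # particular solution of L[u] = f1
u2 = [0.0, 0.0, 1.0]           # particular solution of L[u] = f2
f1 = mat_vec(A, u1)            # right-hand sides defined by the solutions
f2 = mat_vec(A, u2)
z = [2.0, -1.0, 1.0]           # kernel element: A z = 0

c1, c2 = 3.0, -2.0
u_star = combine(c1, u1, c2, u2)        # candidate particular solution
f_combined = combine(c1, f1, c2, f2)    # combined right-hand side

# General solution: add any multiple of the kernel element.
u_general = combine(1.0, u_star, 5.0, z)

print(mat_vec(A, u_star), f_combined)
print(mat_vec(A, u_general), f_combined)
```

Both `u_star` and `u_general` reproduce the combined right-hand side, and they differ by an element of the kernel, in line with (B.50).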


References 


[1] Abdulloev, K.O., Bogolubsky, I.L., and Makhankov, V.G., One more example of
inelastic soliton interaction, Phys. Lett. A 56 (1976), 427-428.

[2] Ablowitz, M.J., and Clarkson, P.A., Solitons, Nonlinear Evolution Equations and
the Inverse Scattering Transform, L.M.S. Lecture Notes in Math., vol. 149,
Cambridge University Press, Cambridge, 1991.

[3] Abraham, R., Marsden, J.E., and Ratiu, T., Manifolds, Tensor Analysis, and
Applications, Springer-Verlag, New York, 1988.

[4] Airy, G.B., On the intensity of light in the neighborhood of a caustic, Trans.
Cambridge Phil. Soc. 6 (1838), 379-402.

[5] Aki, K., and Richards, P.G., Quantitative Seismology, W.H. Freeman, San
Francisco, 1980.

[6] Ames, W.F., Numerical Methods for Partial Differential Equations, 3rd ed.,
Academic Press, New York, 1992.

[7] Antman, S.S., Nonlinear Problems of Elasticity, Appl. Math. Sci., vol. 107,
Springer-Verlag, New York, 1995.

[8] Apostol, T.M., Calculus, Blaisdell Publishing Co., Waltham, Mass., 1967-1969.

[9] Apostol, T.M., Introduction to Analytic Number Theory, Springer-Verlag, New
York, 1976.

[10] Atkinson, K., and Han, W., Spherical Harmonics and Approximations on the Unit
Sphere: An Introduction, Lecture Notes in Math., vol. 2044, Springer, Berlin,
2012.

[11] Bank, S.B., and Kaufman, R.P., A note on Hölder’s theorem concerning the
gamma function, Math. Ann. 232 (1978), 115-120.

[12] Batchelor, G.K., An Introduction to Fluid Dynamics, Cambridge University Press,
Cambridge, 1967.

[13] Bateman, H., Some recent researches on the motion of fluids, Monthly Weather
Rev. 43 (1915), 63-170.

[14] Benjamin, T.B., Bona, J.L., and Mahony, J.J., Model equations for long waves
in nonlinear dispersive systems, Phil. Trans. Roy. Soc. London A 272 (1972),
47-78.

[15] Berest, Y., and Winternitz, P., Huygens’ principle and separation of variables,
Rev. Math. Phys. 12 (2000), 159-180.

[16] Berry, M.V., Marzoli, I., and Schleich, W., Quantum carpets, carpets of light,
Physics World 14(6) (2001), 39-44.

[17] Birkhoff, G., Hydrodynamics: A Study in Logic, Fact and Similitude, 2nd ed.,
Princeton University Press, Princeton, 1960.

[18] Birkhoff, G., and Rota, G.-C., Ordinary Differential Equations, Blaisdell Publ. Co.,
Waltham, Mass., 1962.

P  J.  Olver,  Introduction  to  Partial  Differential  Equations,  Undergraduate  Texts  in  Mathematics, 
DOI  10.1007/978-3-319-02099-0,  ©  Springer  International  Publishing  Switzerland  2014 




[19]  Black,  F.,  and  Scholes,  M.,  The  pricing  of  options  and  corporate  liabilities,  J. 

Political  Economy  81  (1973),  637-654. 

[20]  Blanchard,  P.,  Devaney,  R.  L.,  and  Hall,  G.R.,  Differential  Equations ,  Brooks-Cole 

Publ.  Co.,  Pacific  Grove,  Calif.,  1998. 

[21]  Boussinesq, J., Théorie des ondes et des remous qui se propagent le long d’un
canal rectangulaire horizontal, en communiquant au liquide contenu dans ce
canal des vitesses sensiblement pareilles de la surface au fond, J. Math. Pures
Appl. 17 (2) (1872), 55-108.

[22]  Boussinesq,  J.,  Essai  sur  la  theorie  des  eaux  courants,  Mem.  Acad.  Sci.  Inst.  Nat. 

France  23  (1)  (1877),  1-680. 

[23]  Boyce,  W.  E.,  and  DiPrima,  R.  C.,  Elementary  Differential  Equations  and  Boundary 

Value  Problems ,  7th  ed.,  John  Wiley  &  Sons,  Inc.,  New  York,  2001. 

[24]  Bradie,  B.,  A  Friendly  Introduction  to  Numerical  Analysis ,  Prentice-Hall,  Inc., 

Upper  Saddle  River,  N.J.,  2006. 

[25]  Bronstein,  M.,  Symbolic  integration  I:  Transcendental  Functions ,  Springer- Verlag, 

New  York,  1997. 

[26]  Burgers,  J.  M.,  A  mathematical  model  illustrating  the  theory  of  turbulence,  Adv. 

Appl.  Mech.  1  (1948),  171-199. 

[27]  Cantwell,  B.J.,  Introduction  to  Symmetry  Analysis ,  Cambridge  University  Press, 

Cambridge,  2003. 

[28]  Carleson,  L.,  On  the  convergence  and  growth  of  partial  sums  of  Fourier  series, 

Acta  Math.  116  (1966),  135-157. 

[29]  Carmichael,  R.,  The  Theory  of  Numbers ,  Dover  Publ.,  New  York,  1959. 

[30]  Chen,  G.,  and  Olver,  P.  J.,  Dispersion  of  discontinuous  periodic  waves,  Proc.  Roy. 

Soc.  London  469  (2012),  20120407. 

[31]  Coddington,  E.A.,  and  Levinson,  N.,  Theory  of  Ordinary  Differential  Equations. 

McGraw-Hill,  New  York,  1955. 

[32]  Cole, J.D., On a quasilinear parabolic equation occurring in aerodynamics, Q.
Appl. Math. 9 (1951), 225-236.

[33]  Courant, R., Friedrichs, K.O., and Lewy, H., Über die partiellen
Differenzengleichungen der mathematischen Physik, Math. Ann. 100 (1928), 32-74.

[34]  Courant,  R.,  and  Hilbert,  D.,  Methods  of  Mathematical  Physics ,  vol.  I, 

Interscience  Publ.,  New  York,  1953. 

[35]  Courant,  R.,  and  Hilbert,  D.,  Methods  of  Mathematical  Physics ,  vol.  II, 

Interscience  Publ.,  New  York,  1953. 

[36]  Drazin,  P.  G.,  and  Johnson,  R.  S.,  Solitons:  An  Introduction ,  Cambridge  University 

Press,  Cambridge,  1989. 

[37]  Dym,  H.,  and  McKean,  H.P.,  Fourier  Series  and  Integrals ,  Academic  Press,  New 

York,  1972. 

[38]  Evans,  L.  C.,  Partial  Differential  Equations ,  Grad.  Studies  Math.  vol.  19,  Amer. 

Math.  Soc.,  Providence,  R.I.,  1998. 

[39]  Feller,  W.,  An  Introduction  to  Probability  Theory  and  Its  Applications ,  3rd  ed.,  J. 

Wiley  &  Sons,  New  York,  1968. 

[40]  Fermi,  E.,  Pasta,  J.,  and  Ulam,  S.,  Studies  of  nonlinear  problems.  I.,  preprint, 

Los  Alamos  Report  LA  1940,  1955;  in:  Nonlinear  Wave  Motion ,  A.  C.  Newell, 
ed.,  Lectures  in  Applied  Math.,  vol.  15,  American  Math.  Soc.,  Providence, 

R.I.,  1974,  pp.  143-156. 

[41]  Forsyth,  A.  R.,  The  Theory  of  Differential  Equations.  Cambridge  University  Press, 

Cambridge,  1890,  1900,  1902,  1906. 

[42]  Fourier,  J.,  The  Analytical  Theory  of  Heat ,  Dover  Publ.,  New  York,  1955. 




[43]  Gander,  M.J.,  and  Kwok,  F.,  Chladni  figures  and  the  Tacoma  bridge:  motivating 

PDE  eigenvalue  problems  via  vibrating  plates,  SIAM  Review  54  (2012), 
573-596. 

[44]  Garabedian,  P.,  Partial  Differential  Equations ,  2nd  ed.,  Chelsea  Publ.  Co.,  New 

York,  1986. 

[45]  Gardner,  C.S.,  Greene,  J.M.,  Kruskal,  M.D.,  and  Miura,  R.  M.,  Method  for 

solving  the  Korteweg-de Vries  equation,  Phys.  Rev.  Lett.  19  (1967),  1095-1097. 

[46]  Gonzalez,  R.  C.,  and  Woods,  R.  E.,  Digital  Image  Processing ,  2nd  ed., 

Prentice-Hall,  Inc.,  Upper  Saddle  River,  N.J.,  2002. 

[47]  Gordon,  C.,  Webb,  D.L.,  and  Wolpert,  S.,  One  cannot  hear  the  shape  of  a  drum, 

Bull.  Amer.  Math.  Soc.  27  (1992),  134-138. 

[48]  Gradshteyn,  I.S.,  and  Ryzhik,  I.W.,  Table  of  Integrals ,  Series  and  Products , 

Academic  Press,  New  York,  1965. 

[49]  Gurtin,  M.E.,  An  Introduction  to  Continuum  Mechanics ,  Academic  Press,  New 

York,  1981. 

[50]  Haberman,  R.,  Elementary  Applied  Partial  Differential  Equations ,  3rd  ed., 

Prentice-Hall,  Inc.,  Upper  Saddle  River,  NJ,  1998. 

[51]  Hairer,  E.,  Lubich,  C.,  and  Wanner,  G.,  Geometric  Numerical  Integration , 

Springer- Verlag,  New  York,  2002. 

[52]  Hale,  J.K.,  Ordinary  Differential  Equations.  2nd  ed.,  R.E.  Krieger  Pub.  Co., 

Huntington,  N.Y.,  1980. 

[53]  Henrici,  P.,  Applied  and  Computational  Complex  Analysis ,  vol.  1,  J.  Wiley  & 

Sons,  New  York,  1974. 

[54]  Hille,  E.,  Ordinary  Differential  Equations  in  the  Complex  Domain ,  John  Wiley  & 

Sons,  New  York,  1976. 

[55]  Hobson,  E.W.,  The  Theory  of  Functions  of  a  Real  Variable  and  the  Theory  of 

Fourier’s  Series ,  Dover  Publ.,  New  York,  1957. 

[56]  Hopf, E., The partial differential equation $u_t + u u_x = \mu u_{xx}$, Commun. Pure Appl.
Math. 3 (1950), 201-230.

[57]  Howison,  S.,  Practical  Applied  Mathematics:  Modelling,  Analysis,  Approximation , 

Cambridge  University  Press,  Cambridge,  2005. 

[58]  Hydon,  P.  E.,  Symmetry  Methods  for  Differential  Equations ,  Cambridge  Texts  in 

Appl.  Math.,  Cambridge  University  Press,  Cambridge,  2000. 

[59]  Ince,  E.L.,  Ordinary  Differential  Equations ,  Dover  Publ.,  New  York,  1956. 

[60]  Iserles,  A.,  A  First  Course  in  the  Numerical  Analysis  of  Differential  Equations , 

Cambridge  University  Press,  Cambridge,  1996. 

[61]  Jost,  J.,  Partial  Differential  Equations ,  Graduate  Texts  in  Mathematics,  vol.  214, 

Springer- Verlag,  New  York,  2007. 

[62]  Kamke,  E.,  Differentialgleichungen  Losungsmethoden  und  Losungen ,  vol.  1,  Chelsea, 

New  York,  1971. 

[63]  Keller,  H.B.,  Numerical  Methods  for  Two-Point  Boundary -Value  Problems ,  Blaisdell, 

Waltham,  MA,  1968. 

[64]  Knobel,  R.,  An  Introduction  to  the  Mathematical  Theory  of  Waves ,  American 

Mathematical  Society,  Providence,  RI,  2000. 

[65]  Korteweg,  D.J.,  and  de  Vries,  G.,  On  the  change  of  form  of  long  waves 

advancing  in  a  rectangular  channel,  and  on  a  new  type  of  long  stationary 
waves,  Phil.  Mag.  (5)  39  (1895),  422-443. 

[66]  Landau, L.D., and Lifshitz, E.M., Quantum Mechanics (Non-relativistic Theory),
Course of Theoretical Physics, vol. 3, Pergamon Press, New York, 1977.

[67]  Levine,  I.N.,  Quantum  Chemistry ,  5th  ed.,  Prentice-Hall,  Inc.,  Upper  Saddle 

River,  N.J.,  2000. 




[68]  Lighthill,  M.J.,  Introduction  to  Fourier  Analysis  and  Generalised  Functions , 

Cambridge  University  Press,  Cambridge,  1970. 

[69]  Lin,  C.C.,  and  Segel,  L.  A.,  Mathematics  Applied  to  Deterministic  Problems  in  the 

Natural  Sciences ,  SIAM,  Philadelphia,  1988. 

[70]  McOwen,  R.  C.,  Partial  Differential  Equations :  Methods  and  Applications , 

Prentice-Hall,  Inc.,  Upper  Saddle  River,  N.J.,  2002. 

[71]  Merton,  R.  C.,  Theory  of  rational  option  pricing,  Bell  J.  Econ.  Management  Sci.  4 

(1973),  141-183. 

[72]  Messiah,  A.,  Quantum  Mechanics ,  John  Wiley  &  Sons,  New  York,  1976. 

[73]  Miller,  W.,  Jr.,  Symmetry  and  Separation  of  Variables ,  Encyclopedia  of 

Mathematics  and  Its  Applications,  vol.  4,  Addison- Wesley  Publ.  Co.,  Reading, 
Mass.,  1977. 

[74]  Milne-Thompson,  L.  M.,  The  Calculus  of  Finite  Differences.  Macmillan  and  Co., 

Ltd.,  London,  1951. 

[75]  Misner,  C.W.,  Thorne,  K.  S.,  and  Wheeler,  J.A.,  Gravitation ,  W.  H.  Freeman,  San 

Francisco,  1973. 

[76]  Miura,  R.  M.,  Gardner,  C.S.,  and  Kruskal,  M.D.,  Korteweg-de Vries  equation 

and  generalizations.  II.  Existence  of  conservation  laws  and  constants  of  the 
motion,  J.  Math.  Phys.  9  (1968),  1204-1209. 

[77]  Moon,  F.C.,  Chaotic  Vibrations ,  John  Wiley  &  Sons,  New  York,  1987. 

[78]  Moon,  P.,  and  Spencer,  D.E.,  Field  Theory  Handbook ,  Springer- Verlag,  New  York, 

1971. 

[79]  Morse,  P.  M.,  and  Feshbach,  H.,  Methods  of  Theoretical  Physics ,  McGraw-Hill,  New 

York,  1953. 

[80]  Morton,  K.W.,  and  Mayers,  D.F.,  Numerical  Solution  of  Partial  Differential 

Equations ,  2nd  ed.,  Cambridge  University  Press,  Cambridge,  2005. 

[81]  Murray,  J.D.,  Mathematical  Biology ,  3rd  ed.,  Springer- Verlag,  New  York, 

2002-2003. 

[82]  Oberhettinger,  F.,  Tables  of  Fourier  Transforms  and  Fourier  Transforms  of 

Distributions ,  Springer- Verlag,  New  York,  1990. 

[83]  Øksendal, B., Stochastic Differential Equations: An Introduction with Applications,
Springer-Verlag, New York, 1985.

[84]  Okubo,  A.,  Diffusion  and  Ecological  Problems:  Mathematical  Models , 

Springer- Verlag,  New  York,  1980. 

[85]  Olver,  F.W.  J.,  Asymptotics  and  Special  Functions ,  Academic  Press,  New  York, 

1974. 

[86]  Olver,  F.W.J.,  Lozier,  D.W.,  Boisvert,  R.F.,  and  Clark,  C.W.,  eds.,  NIST 

Handbook  of  Mathematical  Functions ,  Cambridge  University  Press,  Cambridge, 
2010. 

[87]  Olver,  P.  J.,  Applications  of  Lie  Groups  to  Differential  Equations ,  2nd  ed., 

Graduate  Texts  in  Mathematics,  vol.  107,  Springer- Verlag,  New  York,  1993. 

[88]  Olver,  P.  J.,  Dispersive  quantization,  Amer.  Math.  Monthly  117  (2010),  599-610. 

[89]  Olver,  P.  J.,  and  Shakiban,  C.,  Applied  Linear  Algebra ,  Prentice-Hall,  Inc.,  Upper 

Saddle  River,  N.J.,  2005. 

[90]  Oskolkov,  K.I.,  A  class  of  I.  M.  Vinogradov’s  series  and  its  applications  in 

harmonic  analysis,  in:  Progress  in  Approximation  Theory  ,  Springer  Ser. 
Comput.  Math.,  19,  Springer,  New  York,  1992,  pp.  353-402. 

[91]  Pinchover,  Y.,  and  Rubinstein,  J.,  An  Introduction  to  Partial  Differential 

Equations ,  Cambridge  University  Press,  Cambridge,  2005. 

[92]  Pinsky,  M.A.,  Partial  Differential  Equations  and  Boundary -Value  Problems  with 

Applications ,  3rd  ed.,  McGraw-Hill,  New  York,  1998. 




[93] Polyanin, A.D., and Zaitsev, V.F., Handbook of Exact Solutions for Ordinary
Differential Equations, 2nd ed., Chapman & Hall/CRC, Boca Raton, FL,
2003.

[94] Press, W.H., Teukolsky, S.A., Vetterling, W.T., and Flannery, B.P., Numerical
Recipes: The Art of Scientific Computing, 3rd ed., Cambridge University
Press, Cambridge, 2007.

[95] Reed, M., and Simon, B., Methods of Modern Mathematical Physics, Academic
Press, New York, 1972.

[96] Royden, H.L., and Fitzpatrick, P.M., Real Analysis, 4th ed., Pearson Education
Inc., Boston, MA, 2010.

[97] Rudin, W., Principles of Mathematical Analysis, 3rd ed., McGraw-Hill, New York,
1976.

[98] Rudin, W., Real and Complex Analysis, 3rd ed., McGraw-Hill, New York, 1987.

[99] Salsa, S., Partial Differential Equations in Action: From Modelling to Theory,
Springer-Verlag, New York, 2008.

[100] Sapiro, G., Geometric Partial Differential Equations and Image Analysis,
Cambridge University Press, Cambridge, 2001.

[101] Schrödinger, E., Collected Papers on Wave Mechanics, Chelsea Publ. Co., New
York, 1982.

[102] Schumaker, L.L., Spline Functions: Basic Theory, John Wiley & Sons, New York,
1981.

[103] Schwartz, L., Théorie des distributions, Hermann, Paris, 1957.

[104] Scott Russell, J., On waves, in: Report of the Meeting, British Assoc. Adv.
Sci., 1845, pp. 311-390.

[105] Sethares, W.A., Tuning, Timbre, Spectrum, Scale, Springer-Verlag, New York,
1999.

[106] Siegel, C.L., Über einige Anwendungen diophantischer Approximationen, in:
Gesammelte Abhandlungen, vol. 1, Springer-Verlag, New York, 1966, pp.
209-266.

[107] Smoller, J., Shock Waves and Reaction-Diffusion Equations, 2nd ed.,
Springer-Verlag, New York, 1994.

[108] Stewart, J., Calculus: Early Transcendentals, vols. 1 & 2, 7th ed., Cengage
Learning, Mason, OH, 2012.

[109] Stokes, G.G., On a difficulty in the theory of sound, Phil. Mag. 33(3) (1848),
349-356.

[110] Stokes, G.G., Mathematical and Physical Papers, Cambridge University Press,
Cambridge, 1880-1905.

[111] Stokes, G.G., Mathematical and Physical Papers, 2nd ed., Johnson Reprint Corp.,
New York, 1966.

[112] Strang, G., Introduction to Applied Mathematics, Wellesley Cambridge Press,
Wellesley, Mass., 1986.

[113] Strang, G., and Fix, G.J., An Analysis of the Finite Element Method,
Prentice-Hall, Inc., Englewood Cliffs, N.J., 1973.

[114] Strauss, W.A., Partial Differential Equations: An Introduction, John Wiley &
Sons, New York, 1992.

[115] Thaller, B., Visual Quantum Mechanics, Springer-Verlag, New York, 2000.

[116] Thijssen, J., Computational Physics, Cambridge University Press, Cambridge, 1999.

[117] Titchmarsh, E.C., Theory of Functions, Oxford University Press, London, 1968.

[118] Varga, R.S., Matrix Iterative Analysis, 2nd ed., Springer-Verlag, New York, 2000.




[119] Watson, G.N., A Treatise on the Theory of Bessel Functions, Cambridge
University Press, Cambridge, 1952.

[120] Weinberger, H.F., A First Course in Partial Differential Equations, Dover Publ.,
New York, 1995.

[121] Wiener, N., I Am a Mathematician, Doubleday, Garden City, N.Y., 1956.

[122] Whitham, G.B., Linear and Nonlinear Waves, John Wiley & Sons, New York,
1974.

[123] Wilmott, P., Howison, S., and Dewynne, J., The Mathematics of Financial
Derivatives, Cambridge University Press, Cambridge, 1995.

[124] Yong, D., Strings, chains, and ropes, SIAM Review 48 (2006), 771-781.

[125] Zabusky, N.J., and Kruskal, M.D., Interaction of “solitons” in a collisionless
plasma and the recurrence of initial states, Phys. Rev. Lett. 15 (1965),
240-243.

[126] Zienkiewicz, O.C., and Taylor, R.L., The Finite Element Method, 4th ed.,
McGraw-Hill, New York, 1989.

[127] Zwillinger, D., Handbook of Differential Equations, Academic Press, Boston, 1992.

[128] Zygmund, A., Trigonometric Series, 3rd ed., Cambridge University Press,
Cambridge, 2002.


Symbol  Index 


Symbol 

Meaning 

Page(s) 

c  d 

addition  of  scalars 

575 

z  +  w 

complex  addition 

571 

A  +  B 

addition  of  matrices 

575 

V  +  w 

addition  of  vectors 

575 

f  +  9 

addition  of  functions 

575 

zw 

complex  multiplication 

571 

z/w 

complex  division 

572 

cv, 

cA,  cf 

scalar  multiplication 

575 

z 

complex  conjugate 

571 

n 

closure  of  subset  or  domain 

243 

0 

zero  vector 

xvi,  575 

>  0 

positive  definite 

355,  578 

>  0 

positive  semi-definite 

355 

r1 

inverse  function 

xvi 

A-1 

inverse  matrix 

xvi 

/  0 

+  ),  fix  ) 

one-sided  limits 

xvi 

n! 

factorial 

163,  453 

(; 

binomial  coefficient 

163 

\  / 

absolute  value,  modulus 

94,  225,  571 

• 

norm 

73,89,106,284,356, 

578,579,581 

• 

double  norm 

380 

• 

norm 

356 

v  •  w 

dot  product 

578 

z  •  w 

Hermitian  dot  product 

580 

(•> 

expected  value 

287 

<■> 

•> 

inner  product 

73,89,107, 285,341, 
578,579,581 

«•, 

•)) 

inner  product 

341 

[0,1] 

closed  interval 

xvi 

{/ 

1 

set 

xvi 

e 

element  of 

xvi 



not  element  of 

xvi 

C,  C 

subset 

xvi 

u 

union 

xvi 

n 

intersection 

xvi 

\ 

set  theoretic  difference 

xvi 

• — 

definition  of  symbol 

xvi 

— 

identical  equality  of  functions 

xvi 

— 

equivalence  in  modular  arithmetic 

xvi 

o 

composition 

xvi 

* 

convolution 

95,  281 

L* 

adjoint  operator 

341 

r^j 

Fourier  series  representation 

74 

asymptotic  equality 

300 

f-X—^Y 

function 

xvi 

xn  — )►  X 

convergent  sequence 

xvi 

fn^f 

weak  convergence 

230 

f(x+),  f(x~) 

one-sided  limits 

41,  79 

u'  ,u" ,  .  .  . 

space  derivatives 

xvii 

u,  ii, . . . 

time  derivatives 

xvii 

1  ^tx")  ’  "  ' 

du  d2u 

dx  ’  dx2  ’ 
d 

d 

du  d2u  d2u 

partial  derivatives 

ordinary  derivatives 

partial  derivative 
boundary  of  domain 

partial  derivatives 

xvii,  1 

xvii,  1 

xvii,  1 

5,  152,  504 

xvii,  1 

dx  ’  dx 2  ’  dt  dx 

d  9 

x  ’  dx 

partial  derivative  operator 

2 

d 

dxi 

normal  derivative 

153,  244,  504 

V 

gradient 

150,  242,  345,  505 

V- 

divergence 

242,  347,  505 

Vx 

curl 

242 

V2 

Laplacian 

243 

□ 

Tl 

wave  operator 

50 

E 

i  —  1 

summation 

xvi 

J  f(x)  dx 

indefinite  integral 

xvii 

fb 

/  /(x)  dx 

definite  integral 

xvii 

a 




oo 


+  f(x)  dx 

J  —  oo 

principal  value  integral 

283 

//  f(x,y)dxdy 

J  Jn 

double  integral 

243 

/  /  /  z)  dx  dy  dz  triple  integral 

J  J  Jn 

505 

[  f(s)ds 

Jc 

line  integral  with  respect  to  arc  length 

244 

/  v  dx 

Jc 

line  integral 

243 

(j)  v  dx 

line  integral  around  closed  curve 

243 

[  j  fdS 

surface  integral 

505 

J  Jan 

a 

Bohr  radius 

567 

A 

space  of  analytic  functions 

576 

ak 

Fourier  coefficient 

74,  89 

Ai 

Airy  function 

327,  460 

arg 

argument  (see  phase) 

xvi,  573 

b 

finite  element  vector 

401 

B 

magnetic  held 

551 

bk 

Fourier  coefficient 

74,  89 

Bi 

Airy  function  of  the  second  kind 

462 

c 

wave  speed 

19,  24,  50,  486,  546 

c 

finite  element  coefficient  vector 

401 

€ 

complex  numbers 

xv,  571 

C9 

group  velocity 

331 

Ck 

complex  Fourier  coefficient 

89 

Ck 

eigenfunction  series  coefficient 

378 

*'p 

phase  velocity 

330 

c° 

space  of  continuous  functions 

108,  576 

cn 

space  of  differentiable  functions 

5,  576 

c°° 

space  of  smooth  functions 

576 

<cn 

n-dimensional  complex  space 

xv,  575 

coker 

cokernel 

350 

cos 

cosine 

6,  89 

cosh 

hyperbolic  cosine 

88 

coth 

hyperbolic  cotangent 

91,  317 

CSC 

cosecant 

230 

curl 

curl  (see  also  V  x ) 

242 

d 

ordinary  derivative 

xvii,  1 

D 

derivative  operator 

342,  585 

D 

domain 

5 



det 

determinant 

582 

dim 

dimension 

577 

div 

divergence  (see  also  V-) 

242 

ds 

arc  length  element 

244 

dS 

surface  area  element 

505 

e 

base  of  natural  logarithm 

xvi 

E 

energy 

61,  132,  151 

E 

electric  held 

551 

e* 

exponential 

5 

ez 

complex  exponential 

573 

standard  basis  vector 

216, 

577 

erf 

error  function 

55 

erfc 

complementary  error  function 

302 

/ 

periodic  extension 

77 

j= 

function  space 

575 

T 

Fourier  transform 

264 

inverse  Fourier  transform 

265 

F(t, 

x\  £)  fundamental  solution 

292, 

387,  481 

G(x] 

£),  G^(x)  Green’s  function 

234, 

240,  248 

G(t, 

X]t,£)  general  fundamental  solution 

297 

h 

step  size 

182 

h 

Planck’s  constant 

6,  287,  394 

Hn 

Hermite  polynomial 

311 

TTUl 

Hn  > 

H harmonic  polynomial 

520 

i  =  y— 1  imaginary  unit 

571 

i 

identity  matrix 

575 

Im 

imaginary  part 

571 

^ m 

Bessel  function 

468 

k 

frequency  variable 

264 

k 

wave  number 

330 

K 

finite  element  matrix 

401 

K[u 

right  hand  side  of  evolution  equation 

291 

kv 

elemental  stiffness 

417 

Tsvn 
Kn  i 

K ™  complementary  harmonic  function 

523 

ker 

kernel 

350, 

577 

i 

angular  quantum  number 

568 

L2 

Hilbert  space 

106, 

284 

Lk 

Laguerre  polynomial 

566 

Li 

generalized  Laguerre  polynomial 

566 

L[u 

linear  function/operator 

10,  64,  585 

lim 

,  lim  limits 

xvi 

X 


a 


n 


oo 




lim  ,  lim 

x  — »  a~  x  — >  a+ 

one-sided  limits 

xvi 

log 

natural  or  complex  logarithm 

xvi,  573 

m 

mass 

6 

m 

magnetic  quantum  number 

568 

M 

electron  mass 

564 

Mr,  M* 

spherical  mean 

553 

max 

maximum 

xvi 

min 

minimum 

xvi 

mod 

modular  arithmetic 

xvi 

n 

principal  quantum  number 

568 

n 

unit  normal 

153,  244,  505 

N 

natural  numbers 

XV 

0 

zero  matrix 

575 

o(/0 

Big  Oh  notation 

182 

p 

pressure 

3 

p 

option  exercise  price 

299 

p 

Peclet  number 

311 

p 

n 

Legendre  polynomial 

511,  525 

Pm 

Jr n 

trigonometric  Ferrers  function 

515 

pm 
r  n 

Ferrers  (associated  Legendre)  function 

513 

p(n) 

space  of  polynomials  of  degree  <  n 

577 

ph 

phase  (argument) 

xvi,  572 

Q[u] 

quadratic  function(al) 

362 

r 

radial  coordinate 

xv,  3,  160,  572 

r 

cylindrical  radius 

xv,  3,  508 

r 

spherical  radius 

xv,  3,  508 

r 

interest  rate 

299 

R 

real  numbers 

xv

R^n

n-dimensional  Euclidean  space 

xv,  575 

R[u]

Rayleigh  quotient 

375 

Re 

real  part 

571 

rng

range 

576 

s

arc  length 

244 

S

surface  area 

505 

S_m

spherical  Bessel  function 

539 

partial  sum 

75, 113 

S_r

sphere  of  radius  r 

553,  555 

sech 

hyperbolic  secant 

334 

sign 

sign  function 

94,  225 

sin 

sine 

6,  89 

sinh 

hyperbolic  sine 

13,  88 

span 

span 

576 

supp 

support 

407 

t 

time 

xv,  3 

T 

conserved  density 

38,  256 

A^T

transpose  of  matrix 

341,  578 

T, 

finite  element  triangle 

411 

tan 

tangent 

1 

tanh 

hyperbolic  tangent 

135 

u 

dependent  variable 

xv,  3 

u_x,  u_{xx},  ...

partial  derivative 

1 

V 

dependent  variable 

xv,  3 

V 

eigenvector  /  eigenfunction 

371 

V 

vector 

xv,  575 

V 

eigenvector 

66,  582 

V 

vector  field 

3,  242 

V 

vector  space 

575 

V 

potential  function 

6 

perpendicular  vector 

244 

11 

Imn 

atomic  eigenfunction 

568 

V_λ

eigenspace 

371 

W 

dependent  variable 

xv,  3 

w 

heat  flux 

122 

w 

heat  flux  vector 

437 

X 

Cartesian  space  coordinate 

xv,  3,  152,  504 

X 

real  part  of  complex  number 

571 

X 

flux 

38,  256 

y 

Cartesian  space  coordinate 

xv,  3,  152,  504 

y 

imaginary  part  of  complex  number 

571 

Y 

flux 

256 

Y 

m 

Bessel  function  of  the  second  kind 

470 

yrri  ym 

1  n  i  1  n 

spherical  harmonic 

517 

Aim 

y  n 

complex  spherical  harmonic 

519 

Z 

Cartesian  space  coordinate 

xv,  3,  504 

Z 

cylindrical  coordinate 

xv,  3,  508 

z 

complex  number 

571 

Z

integers 

xv

a 

electron  charge 

564 

A" 

radial  wave  function 

568 

γ

thermal  diffusivity 

124,  438,  535 

γ

Euler-Mascheroni  constant 

471 

Γ

gamma  function 

454 

δ,  δ_ξ

delta  function 

217,  219,  246,  247,  527

S 

periodically  extended  delta  function 

229 

s',  y 

derivative  of  delta  function 

225,  226 

Δ

Laplacian 

4,  152,  161,  243,  504,  509

Δ

discriminant 

172,  173 

Δx

step  size 

186 

Ax 

variance 

287 

Δ_S

spherical  Laplacian 

509 

ε

thermal  energy  density 

122,  437 

ε_0

permittivity  constant 

551 

ζ_{m,n}

Bessel  root 

474 

η

characteristic  variable 

51 

θ

polar  angle 

xv,  3,  160,  572 

θ

cylindrical  angle 

xv,  3,  508 

θ

azimuthal  angle 

xv,  3,  508 

C 

root  of  unity 

582 

κ

thermal  conductivity 

65,  123,  437 

κ

stiffness  or  tension 

49 

λ

eigenvalue 

66,  371,  573 

λ

magnification  factor 

189 

μ_0

permeability  constant 

551 

ν

viscosity 

3 

ξ

characteristic  variable 

19,  25,  32,  51 

π

area  of  unit  circle 

5 

ρ

density 

49,  122,  438 

ρ

spectral  radius 

584 

ρ,  ρ_ξ

ramp  function 

91,  223 

ρ_n,  ρ_{n,ξ}

nth  order  ramp  function 

95,  223 

ρ_{m,n}

relative  vibrational  frequency 

495 

σ

shock  position 

41 

σ

heat  capacity 

65,  122,  438 

σ

volatility 

299 

σ,  σ_ξ

unit  step  function 

61,  80,  222 

σ_{m,n}

spherical  Bessel  root 

540 

φ

zenith  angle 

xv,  3,  508 

P> 

wave  function 

286 

φ_k

orthogonal  or  orthonormal  system 

109 

φ_k

basis  for  finite  element  subspace 

401 

χ

specific  heat  capacity 

122,  431 

χ_D

characteristic  function 

485 

time-dependent  wave  function

394,  564

frequency

59,  330

domain

152,  242,  504


Author  Index 


Abdulloev,  K.  O.  337,  [1]

Ablowitz,  M.  J.  283,  292,  324,  333,  337, 
338,  [2] 

Abraham,  R.  161,  [3] 

Airy,  G.  B.  281,  327,  334,  [4] 

Aki,  K.  549,  [5] 

Ames,  W.F.  181,  213,  400,  410,  [6] 
Antman,  S.  S.  324,  486,  549,  [7] 
Apostol,  T.  M.  5,  20,  76,  87,  100,  105, 
169,  182,  236,  242,  245,  267,  312, 
437,  500,  505,  [8],  [9] 

Atkinson,  K.  519,  [10] 

Bank,  S.B.  455,  [11] 

Batchelor,  G.  K.  3,  [12]
Bateman,  H.  315,  [13] 

Benjamin,  T.  B.  337,  [14] 

Berest,  Y.  563,  [15] 

Bernoulli,  J.  xvii,  452 
Berry,  M.  V.  329,  [16] 

Bessel,  F.W.  111,  452
Birkhoff,  G.  ix,  2,  11,  29,  67,  298,  305, 
309,  457,  [17],  [18] 

Black,  F.  299,  [19] 

Blanchard,  P.  ix,  2,  11,  22,  25,  29,  65, 

68,  [20] 

Bogolubsky,  I.  L.  337,  [1] 

Bohr,  N.  H.  D.  567 
Boisvert,  R.  F.  xvi,  55,  310,  327,  364, 
435,  452,  468,  511,  512,  513,  567, 
573,  [86] 

Bona,  J.  L.  337,  [14] 

Bourget,  J.  490 

Boussinesq,  J.  292,  333,  335,  [21],  [22]
Boyce,  W.  E.  ix,  2,  11,  22,  25,  29,  65, 
67,  68,  162,  169,  263,  298,  300, 
309,  466,  [23]

Bradie,  B.  ix,  135,  185,  407,  453,  [24]
Bronstein,  M.  267,  [25] 

Burgers,  J.  M.  315,  [26]

Cantor,  G.  F.  L.  P.  xvii,  64 
Cantwell,  B.  J.  305,  [27] 

Carleson,  L.  117,  [28] 

Carmichael,  R.  500,  [29]
Cauchy,  A.  L.  xvii,  175,  215 
Chen,  G.  329,  [30] 

Chladni,  E.  F.  F.  497 


Clark,  C.  W.  xvi,  55,  310,  327,  364,  435, 
452,  468,  511,  512,  513,  567,  573, 

[86] 

Clarkson,  P.  A.  283,  292,  324,  333,  337, 
338,  [2] 

Coddington,  E.  A.  2,  [31]
Cole,  J.D.  318,  [32] 

Courant,  R.  xvii,  4,  176,  177,  179,  197, 
246,  255,  340,  377,  436,  440,  449, 
477,  497,  541,  [33],  [34],  [35] 
Crank,  J.  192 

d’Alembert,  J.  L.  R.  xvii,  15,  50,  140, 
149,  558 

de  Broglie,  L.  V.  P.  R.  287 
de  Coulomb,  C.-A.  252,  503 
Devaney,  R.  L.  ix,  2,  11,  22,  25,  29,  65, 
68,  [20] 

de  Vries,  G.  333,  [65] 

Dewynne,  J.  299,  [123] 

DiPrima,  R.  C.  ix,  2,  11,  22,  25,  29,  65, 
67,  68,  162,  169,  263,  298,  300, 
309,  466,  [23]
Dirac,  P.  A.  M.  217 
Dirichlet,  J.  P.  G.  L.  7,  368 
Drazin,  P.  G.  38,  292,  324,  333,  337, 

338,  [36] 

du  Bois-Reymond,  P.  D.  G.  430 
Duhamel,  J.  M.  C.  298 
Dym,  H.  76,  99,  107,  115,  117,  263,  265, 
275,  286,  344,  [37] 

Einstein,  A.  19,  31,  63,  149,  504 
Euler,  L.  xvii,  3,  454,  461,  573 
Evans,  L.  C.  xvii,  4,  314,  340,  350,  427, 
436,  535,  546,  [38] 

Feller,  W.  55,  295,  [39] 

Fermi,  E.  333,  [40] 

Ferrers,  N.  M.  513 

Feshbach,  H.  170,  508,  569,  [79] 

Fitzpatrick,  P.  M.  76,  100,  102,  107, 

108,  119,  217,  219,  344,  [96] 

Fix,  G.  J.  399,  400,  410,  431,  [113] 
Flannery,  B.  P.  ix,  135,  181,  [94] 
Forsyth,  A.  R.  317,  [41] 

Fourier,  J.  xvii,  63,  64,  71,  114,  123, 

149,  437,  452,  535,  [42] 

Fredholm,  E.  I.  350 


P.J.  Olver,  Introduction  to  Partial  Differential  Equations,  Undergraduate  Texts  in  Mathematics, 
DOI  10.1007/978-3-319-02099-0,  ©  Springer  International  Publishing  Switzerland  2014 


Friedrichs,  K.  O.  197,  [33] 

Frobenius,  F.  G.  464 

Galilei,  G.  20 
Gander,  M.  J.  497,  [43] 

Garabedian,  P.  xvii,  4,  173,  176,  246, 
340,  350,  376,  377,  427,  [44] 
Gardner,  C.  S.  336,  338,  [45],  [76] 
Gauss,  J.  C.  F.  63 
Germain,  M.-S.  497 
Gibbs,  J.  W.  84 
Gonzalez,  R.  C.  442,  [46] 

Gordon,  C.  487,  [47] 

Gradshteyn,  I.  S.  334,  [48] 

Green,  G.  3,  215,  243 
Greene,  J.  M.  336,  [45] 

Gregory,  J.  78 

Gurtin,  M.  E.  256,  549,  [49] 

Haberman,  R.  xvii,  [50] 

Hairer,  E.  181,  [51] 

Hale,  J.K.  ix,  2,  11,  29,  67,  [52] 

Hall,  G.  R.  ix,  2,  11,  22,  25,  29,  65,  68, 

[20] 

Han,  W.  519,  [10] 

Heaviside,  O.  215,  217,  551 
Heisenberg,  W.  K.  286,  288 
Henrici,  P.  256,  [53] 

Hilbert,  D.  xvii,  4,  106,  176,  177,  179, 
246,  255,  340,  368,  377,  436,  440, 
449,  477,  497,  541,  [34],  [35] 

Hille,  E.  ix,  2,  453,  457,  459,  463,  [54] 
Hobson,  E.  W.  329,  [55] 

Hopf,  E.  318,  [56] 

Howison,  S.  vii,  viii,  43,  46,  299,  [57], 

[123]

Hugoniot,  P.  H.  40 
Huygens,  C.  560 
Hydon,  P.  E.  305,  [58] 

Ince,  E.  L.  ix,  2,  453,  457,  459,  463,  472, 

[59] 

Iserles,  A.  ix,  181,  185,  453,  [60] 

Johnson,  R.  S.  38,  292,  324,  333,  337, 
338,  [36] 

Jost,  J.  xvii,  4,  246,  255,  314,  340,  350, 
427,  535,  546,  [61] 

Kamke,  E.  453,  [62] 

Kaufman,  R.  P.  455,  [11]
Keller,  H.B.  185,  355,  364,  [63] 

Kelvin,  L.  (Thomson,  W.)  3,  41,  139, 
331 

Kirchhoff,  G.  R.  503,  558 
Knobel,  R.  317,  [64] 

Korteweg,  D.  J.  333,  [65] 

Kovalevskaya,  S.  V.  175 


Kruskal,  M.D.  333,  336,  338,  [45],  [76], 

[125]

Kwok,  F.  497,  [43] 

Lagrange,  J.-L.  xvii,  182 

Laguerre,  E.  N.  566 

Landau,  L.  D.  108,  278,  288,  383,  394, 

[66] 

Laplace,  P.-S.  xvii,  152 
Lebesgue,  H.  L.  107,  112 
Legendre,  A.-M.  454,  511 
Leibniz,  G.W.  1,  78
Levine,  I.  N.  569,  [67] 

Levinson,  N.  2,  [31] 

Lewy,  H.  197,  [33] 

Lie,  M.  S.  305 

Lifshitz,  E.  M.  108,  278,  288,  383,  394, 

[66] 

Lighthill,  M.  J.  76,  233,  263,  286,  [68] 
Lin,  C.  C.  vii,  viii,  [69] 

Liouville,  J.  363 

Lozier,  D.W.  xvi,  55,  310,  327,  364, 

435,  452,  468,  511,  512,  513,  567, 
573,  [86] 

Lubich,  C.  181,  [51] 

Mahony,  J.  J.  337,  [14] 

Makhankov,  V.  G.  337,  [1] 

Marsden,  J.E.  161,  [3] 

Marzoli,  I.  329,  [16] 

Maxwell,  J.  C.  551 
Mayers,  D.F.  181,  200,  213,  453,  [80] 
McKean,  H.  P.  76,  99,  107,  115,  117, 
263,  265,  275,  286,  344,  [37] 
McOwen,  R.  C.  xvii,  56,  122,  179,  246, 
255,  [70] 

Mendeleev,  D.  I.  568 
Merton,  R.  C.  299,  [71] 

Messiah,  A.  108,  278,  288,  383,  394, 

565,  [72] 

Miller,  W.,  Jr.  305,  [73] 
Milne-Thompson,  L.  M.  185,  [74] 
Minkowski,  H.  56,  560 
Misner,  C.  W.  56,  504,  [75] 

Miura,  R.  M.  336,  338,  [45],  [76] 

Moon,  F.  C.  60,  [77] 

Moon,  P.  170,  508,  [78] 

Morse,  P.  M.  170,  508,  569,  [79] 

Morton,  K.  W.  181,  200,  213,  453,  [80] 
Murray,  J.  D.  438,  [81] 

Navier,  C.  L.  M.  H.  3 
Neumann,  C.  G.  7 
Newton,  I.  vii,  49,  63,  135,  149,  182, 
252,  388,  503 
Nicolson,  P.  192 


Oberhettinger,  F.  273,  [82] 

Okubo,  A.  438,  [84] 

Olver,  F.W.J.  xvi,  55,  310,  327,  331, 
364,  435,  452,  453,  461,  462,  468, 
470,  472,  474,  511,  512,  513,  567, 
573,  [85],  [86] 

Olver,  P.  J.  ix,  xi,  xv,  xviii,  11,  27,  38, 
39,  65,  66,  67,  73,  75,  98,  110, 

162,  191,  210,  212,  300,  305,  308, 
310,  311,  329,  338,  350,  357,  363, 
372,  378,  390,  400,  402,  407,  408, 
411,  575,  579,  581,  582,  583,  584, 
585,  [30],  [87],  [88],  [89] 

Oskolkov,  K.  I.  332,  [90] 

Øksendal,  B.  viii,  299,  [83]

Parseval  des  Chênes,  M.-A.  114

Pasta,  J.  333,  [40] 

Pauli,  W.  E.  568 
Pinchover,  Y.  xvii,  [91] 

Pinsky,  M.  A.  xvii,  [92] 

Plancherel,  M.  114 
Planck,  M.  K.  E.  L.  287,  394 
Poisson,  S.D.  31,  152,  503,  558 
Polyanin,  A.  D.  453,  [93] 

Press,  W.  H.  ix,  135,  181,  [94] 

Rankine,  W.  J.  M.  40 
Ratiu,  T.  161,  [3] 

Rayleigh,  L.  (Strutt,  J.W.)  40,  375 
Reed,  M.  344,  374,  383,  565,  [95] 
Richards,  P.  G.  549,  [5] 

Richardson,  L.  F.  194 

Riemann,  G.  F.  B.  xvii,  31,  63,  87,  112 

Robin,  V.  G.  123 

Rota,  G.-C.  ix,  2,  11,  29,  67,  298,  309, 
457,  [18]

Royden,  H.  L.  76,  100,  102,  107,  108, 
119,  217,  219,  344,  [96] 

Rubinstein,  J.  xvii,  [91] 

Rudin,  W.  5,  76,  100,  102,  105,  107, 

108,  119,  169,  182,  217,  219,  256, 
263,  267,  344,  [97],  [98] 

Ryzhik,  I.  W.  334,  [48] 

Salsa,  S.  xvii,  4,  122,  340,  350,  427,  436, 
535,  546,  [99] 

Sapiro,  G.  442,  [100] 

Schleich,  W.  329,  [16] 

Scholes,  M.  299,  [19] 

Schrödinger,  E.  394,  [101]

Schumaker,  L.  L.  210,  400,  408,  [102]
Schwartz,  L.  217,  [103] 

Scott  Russell,  J.  334,  [104] 

Segel,  L.  A.  vii,  viii,  [69] 

Sethares,  W.  A.  144,  [105] 


Shakiban,  C.  ix,  xv,  xviii,  11,  27,  65, 

66,  67,  73,  75,  98,  110,  162,  191, 
210,  212,  300,  350,  357,  363,  372, 
378,  390,  400,  402,  407,  408,  411, 
575,  579,  581,  582,  583,  584,  585, 
[89] 

Siegel,  C.  L.  490,  [106] 

Simon,  B.  344,  374,  383,  565,  [95] 
Smoller,  J.  317,  427,  434,  438,  [107] 
Spencer,  D.  E.  170,  508,  [78] 

Stewart,  J.  5,  20,  105,  236,  242,  245, 
312,  407,  437,  505,  [108] 

Stokes,  G.  G.  3,  41,  331,  [109],  [110], 

[111] 

Strang,  G.  xvii,  357,  399,  400,  410,  431, 

[112],  [113] 

Strauss,  W.  A.  xvii,  [114] 

Strutt,  J.W.  see  Rayleigh,  L. 

Sturm,  F.  O.  R.  363 

Talbot,  W.  H.  F.  329 
Taylor,  B.  75 

Taylor,  R.  L.  400,  410,  431,  [126] 
Teukolsky,  S.A.  ix,  135,  181,  [94] 
Thaller,  B.  108,  329,  394,  [115] 
Thijssen,  J.  565,  [116] 

Thomson,  W.  see  Kelvin,  L. 

Thorne,  K.  S.  56,  504,  [75] 

Titchmarsh,  E.  C.  263,  265,  275,  286, 
[117] 

Ulam,  S.  333,  [40] 

Varga,  R.  S.  213,  402,  411,  [118] 
Vetterling,  W.  T.  ix,  135,  181,  [94] 
Victoria,  Q.  139 
von  Helmholtz,  H.  L.  F.  374 
von  Neumann,  J.  190 

Wanner,  G.  181,  [51]

Watson,  G.  N.  452,  470,  472,  474,  490, 

[119]

Webb,  D.L.  487,  [47] 

Weierstrass,  K.  T.  W.  xvii,  100,  329 
Weinberger,  H.  F.  xvii,  50,  447,  525, 

[120] 

Wheeler,  J.  A.  56,  504,  [75] 

Whitham,  G.B.  31,  44,  46,  179,  315, 

316,  317,  324,  331,  427,  434,  [122]
Wiener,  N.  254,  [121] 

Wilmott,  P.  299,  [123] 

Winternitz,  P.  563,  [15] 

Wolpert,  S.  487,  [47] 

Woods,  R.  E.  442,  [46] 

Yong,  D.  50,  [124] 

Zabusky,  N.  J.  333,  [125]


Zaitsev,  V.  F.  453,  [93]
Zienkiewicz,  O.  C.  400,  410,  431,  [126]
Zwillinger,  D.  453,  [127]
Zygmund,  A.  76,  99,  102,  105,  115,  117,  [128]


Subject  Index 


absolute 

convergence  101 
value  86,  94,  105,  225 
zero  139 
abstraction  339 
acceleration  7,  49 
accidental  degeneracy  499 
acoustics  2,  15,  31 
acoustic  wave  15 
Acta  Numerica  181 
addition  571,  575 
identity  575 
inverse  575 

adjoint  339,  341,  344,  428,  438,  505 
formal  344 
system  350 
weighted  342 
advection  322 
aerodynamics  173 
affine  404 
element  414,  416 
piecewise  404,  411 
afterglow  562,  563 
air  15,  174,  551 
airplane  15,  173,  174 
Airy 

differential  equation  281,  459 
function  327,  364,  435,  461 
second  kind  462 
algebra  263,  273,  275,  453 
linear  viii,  ix,  63,  215,  221,  234,  339, 
353,  400,  575 
numerical  linear  ix,  399 
algebraic 

differential  equation  455 
equation  1,  8,  11,  215,  428 
function  511 
multiplicity  373 
algorithm  399 
altitude  vector  418 
amplitude  334 
analysis  63,  76,  99,  400,  578 
complex  31,  175,  256,  263 
functional  340,  350,  362 
numerical  ix,  181 
real  ix,  219 
vector  viii 


analytic  76,  105,  158,  169,  175,  181, 

521,  576 

function  98,  456,  463 
solution  431 
angle  509 

azimuthal  508,  522,  549 
cylindrical  509 
polar  572 
right  73,  581 
zenith  508,  515 
angular 

coordinate  130 
quantum  number  566,  568 
animal  population  2,  435,  485 
anisotropic  442 

annulus  170,  171,  415,  474,  480,  494, 
499 

ansatz  66,  124,  161,  330,  390,  394,  466, 
475,  535,  538 
exponential  67,  330 
power  162,  464,  520 
trigonometric  546 

approximation  vii,  9,  110,  181,  363,  406 

arbitrage  299 

arbitrary  function  6,  21 

arc  551,  556 

arc-length  244 

area  39,  244,  413,  414,  415,  477 
equal  40,  46,  47,  431 
surface  517,  529,  537,  543,  553,  555 
argument  (see  phase)  xvi,  573 
arithmetic  184,  571 
floating-point  ix,  184 
single-precision  184 
asset  299 

associated  Legendre  function  513 
associativity  281,  575 
astronomy  560 
asymptotics  69,  284,  453,  468 
atom  37,  279,  394,  497,  547,  568 
eigenfunction  568,  570 
orbital  503 
audio  63 

autonomous  24,  29,  556 
average  40,  92,  131,  167,  252,  285,  521, 
523,  553,  560 
weighted  213 


axis 

horizontal  18 
vertical  18 

azimuthal  angle  508,  522,  549 

B  spline  283 

back  substitution  212 

backward  difference  182 

backwards  heat  equation  129,  299,  442 

bacteria  438,  485 

ball  362,  467,  508,  528,  530,  533,  534, 
537,  540,  542,  545,  547,  550,  555 
balloon  549 
bank  299 

bar  2,  15,  49,  64,  65,  96,  122,  124,  132, 
134,  138,  140,  144,  234,  239,  241, 
292,  293,  298,  307,  344,  351,  357, 
405,  440 
barge  334 
barrier  15,  173 

basis  ix,  112,  350,  401,  405,  430,  577, 
583 
bath  123,  134,  154,  436
beam  2,  146,  393 
equation  146,  396 
light  286 
bell  curve  295 
bi-affine  function  415
bidiagonal  211 
bidirectional  5,  324 
big  Oh  182 

biharmonic  equation  361 
Dirichlet  125,  132,  160,  207,  213,
singular  379
function  60,  88,  149,  266,  283,  296
brick  536 
bridge  549 
Brownian  motion  438 
building  549 
potential  318
cable  equation  139,  304
of  finite  differences  182,  185
multivariable  215,  437,  505 
cap  555,  561
capacitor  522,  523,  526,  533
problem  175

-Schwarz  inequality  ix,  107,  175,  285, 
526,  579,  581 
sequence  107,  119 


causality  15,  43,  433 
cavity  524,  534 
CD  63 
ceiling  494 
Celsius  306
center  252 

centered  difference  184,  198 
equation  177
function  485 
boundary  value  problem  385
diffusion  123
circle  167,  258,  450,  500,  551
civil  engineering  549 
solution  5,  7,  17,  51,  144,  255,  399,
410,  427,  428,  432,  535,  546,  556, 
clockwise  243
closed  ix 
surface  505
diffusion  129,  307
collision  54,  292,  324,  336,  337,  438
conditionally  stable  190
conduction  504 
conducting  medium  504 
conductivity  124 
thermal  65,  123,  437 
conductor  123 
cone  56,  560,  564 
conformal  mapping  256 


conic  172 

conjugate  symmetry  580 
connected  17,  141,  245,  345,  359 
simply  243 

conservation  law  15,  38,  46,  131,  201, 
255,  256,  295,  304,  332,  337,  431, 
energy  61,  151,  535
heat  energy  122,  304 
mass  41,  47,  360 

conserved  density  38,  46,  201,  256 
function  285,  506
separation  141,  156,  446,  510
continuous  viii,  7,  63,  80,  82,  94,  102,
dependence  130

function  99,  108,  117,  219,  220,  344, 
Lipschitz  29
piecewise  79,  81,  108,  127,  441
spectrum  337,  340,  374,  383,  565 
continuum  mechanics/physics  viii,  38 
contract  299 
control  9,  263 
convection  438 

-diffusion  equation  139,  311,  314,  438 
convective  flow  311 
convergence  viii,  xvi,  63,  72,  75,  76,  98, 
ratio  viii,  75,  462,  468
root  viii,  75 
theorem  82 

uniform  99,  100-102,  104,  378,  519 
weak  99,  230,  270,  327,  429 
convergent  584,  585 
convolution  242,  281,  282,  295,  484,  544 
integral  301 
periodic  95 
summation  284 
theorem  284 
coordinate
angular  130 

Cartesian  viii,  242,  446,  536 
characteristic  57 
curvilinear  364,  435 
cylindrical  viii,  3,  503,  508,  523,  527, 
530,  536,  547 
ellipsoidal  508 
moving  15,  19 
parabolic  174 
parabolic  spheroidal  508 
polar  viii,  3,  62,  160,  161,  171,  250, 
383,  451,  477,  479,  490,  509,  572 
rectangular  viii,  161,  503 
separable  508 
space  3 

spherical  viii,  3,  503,  508,  520,  524, 
528,  537,  551,  553,  565 
toroidal  508 
corner  81,  193,  441 
node  209 
corpse  129 
cosine  transform  274 
Coulomb 

potential  503,  529,  564 
problem  564 
counterclockwise  243,  415 
crack  256,  427 
Cramer’s  Rule  413 
Crank-Nicolson  method  192 
crest  331 
critical  point  501 
cross  product  viii 

cube  526,  534,  536,  543,  545,  550,  561 
cubic  283,  408 

curl  viii,  13,  242,  243,  349,  507 
curve  172,  176,  254,  571 
bell  295 

characteristic  15,  24,  30,  31,  47,  49, 
176,  177,  178 
lifted  49 
closed  152,  243 
simple  152 
nodal  497,  551 
curved  boundary  411,  412 
curvilinear  coordinate  364,  435 
cusp  254 

cut  locus  511,  515,  525 
cylinder  450,  452,  467,  508,  527,  536, 
537,  547,  550 
cylindrical 
angle  509 

coordinates  viii,  3,  503,  508,  523,  527, 
530,  536,  547 
shell  450 
symmetry  517
cymbal  144 

d’Alembert 

formula  16,  121,  140,  146,  260,  427, 
552,  558 

solution  53,  201,  324,  487 
dam  61 

damped  62,  200 
heat  equation  65 
transport  equation  49 
wave  equation  207 
data  363,  410 
de  Broglie  relation  287 
death  129 

decay  22,  105,  276,  285,  387,  441,  443, 
479,  482,  536,  544 
entropic  291 
exponential  127 
decimal  expansion  108 
deep  water  331 
deflection  249 
deformable  body  341 
deformation  121 
degree  306,  509 
delta 

comb  229,  233 
distribution  215,  217 
function  63,  215,  217,  221,  223,  225, 
228,  233,  246,  249,  270,  276,  277, 
280,  281,  292,  293,  321,  326,  358, 
379,  405,  441,  479,  481,  483,  485, 
521,  544,  552,  554 
three-dimensional  527 
two-dimensional  246,  255 
impulse  221,  234,  248,  291,  292,  387 
wave  558 

denoise  128,  296,  441 
dense  107,  344 
subspace  344,  346,  371 
density  49,  122,  124,  132,  142,  253,  344, 
357,  438,  486,  488,  492,  495 
conserved  38,  46,  201,  256 
momentum  61 
probability  108,  286,  564 
dependent  variable  3 
deposit  299 
depression  31 

derivative  1,  5,  105,  182,  223,  236,  275, 
342,  576 

of  delta  function  233,  277 
of  Fourier  series  94 
left-hand  81 
logarithmic  336 
normal  7,  153,  245,  250,  436,  504,

528,  529 
one-sided  576 
partial  viii,  1,  3,  521 
mixed  5,  8,  50,  242,  556 
radial  554 
right-hand  81 
of  series  101 
operator  354 
ordinary  1,  3 
space  291 

descent  503,  551,  561,  564 
determinant  ix 
Jacobian  58,  255 
deviation  287,  295 
diameter  499,  500 
difference 
backward  182 
centered  184,  198 
finite  vii,  9,  181,  182,  185,  186,  213, 
214,  399,  407,  411,  422 
forward  182 
quotient  182 
differentiable  80,  285 
infinitely  5,  75,  105,  128,  158 
nowhere  329 
differential 

equation  1,  181,  342,  350,  427,  428, 
585 

stochastic  viii,  299 
system  of  2,  66 
zenith  510 

see  also  ordinary  differential  equation, 
partial  differential  equation 
geometry  63 

operator  9,  64,  339,  350,  371 
Bessel  366,  374 
partial  2,  50 

Sturm-Liouville  364,  365,  480 
differentiation  viii,  233,  263,  276,  556 
implicit  49 
numerical  181 
operator  286 

diffusion  172,  299,  311,  315,  385,  388, 
435,  436,  503,  535 
chemical  123 
coefficient  129,  307 
equation  123,  315,  340,  395,  438,  439, 
543 

nonlinear  2,  122,  291,  437,  442 
of  set  485 

process  121,  129,  386 
diffusive  transport  equation  194 


diffusivity  307,  537 
thermal  124,  134,  186,  293,  298,  438, 
535 

Dilation  Theorem  271,  274 
dimension  2,  112,  577 
eigenspace  583 

finite  ix,  11,  98,  109,  215,  220,  400, 
410,  430 

infinite  ix,  11,  99,  109,  215,  340,  342, 
371,  400,  577 

Dirac 
comb  229 
delta  function  217 
equation  vii 
direct  method  255 
Dirichlet 

boundary  condition  7,  123,  141,  147, 
153,  166,  186,  201,  343,  345,  347, 
359,  364,  368,  404,  412,  436,  439, 
441,  446,  486,  488,  504,  521,  522, 
527,  535,  537,  540,  544,  546 
boundary  value  problem  125,  132, 
160,  207,  213,  245,  254,  383,  410, 
416,  442,  450,  474,  508,  528,  531, 
533,  547 

eigenvalue  373,  377 
functional  410,  416,  424 
integral  368,  506 
principle  368,  400,  443,  506 
disconnected  17 
discontinuity  37,  193,  223,  441 
jump  80,  81,  82,  96,  164,  223,  233, 
236,  405,  432 
removable  80 

discontinuous  15,  76,  102,  427 
initial  data  292 
discrete  Fourier  transform  582 
discriminant  172,  173,  174 
disease  123 

disk  121,  160,  167,  249,  251,  253,  374, 
415,  427,  444,  445,  450,  467,  479, 
490,  499,  500,  508 
half  260,  480,  494,  496 
metal  166,  479 
quarter  494 
semi-circular  170,  418 
unit  155,  166 
dislocation  256,  427 
dispersion  vii,  292,  324,  330 
relation  330 
dispersive  329,  396 
equation  328,  486 
medium  2 

quantization  328,  329 
tail  337 

wave  2,  324,  459 
displacement  49,  216,  341,  351,  486
initial  53,  59,  145,  487,  546,  547,  551, 
554,  556,  557,  560,  561,  562 
radial  546 
dissipative  315 
dissonant  144,  490,  496,  549 
distance  307,  578 
distribution  ix,  215,  217,  553 
Gaussian  295 
distributive  575 
disturbance  15,  22,  179,  292 
diverge  98 

divergence  viii,  13,  242,  244,  359,  437, 
439,  505,  535 
operator  347 
theorem  viii,  505,  529 
division  263 

domain  ix,  5,  17,  152,  207,  243,  245, 

248,  253,  339,  341,  345,  359,  486, 
504,  571 

of  dependence  59,  197,  199 
of  influence  56 
irregular  213 
rescaled  495 
space  401,  585,  586 
dominant 
frequency  495 
mode  499 

dot  product  viii,  73,  341,  346,  354,  372, 
578,  585 
Hermitian  580 
double 

Fourier  series  488 
integral  viii,  58,  243,  245,  248,  251, 
346,  422,  432,  437,  448 

weighted  L^2  norm  381
L^2  norm  383,  525
doublet  558 

doubly  infinite  series  89,  91 
driver  44 

drum  63,  144,  152,  153,  214,  486,  487, 
490,  495,  496 

circular  154,  160,  490,  494,  496,  499 
rectangular  499 
square  493 

du  Bois-Reymond  lemma  431,  434 
duality  221,  247,  286,  553 
Duhamel’s  principle  298 
DVD  63 
dynamical  3 

partial  differential  equation  291,  551 
process  7,  15,  172 
system  340,  385 


dynamics  46,  47,  49,  340,  435 
fluid  291,  315 
gas  15,  31,  315 
shock  15,  431 
ear  144 

Earth  136,  137,  508,  530,  537,  545,  549, 
560,  561 

earthquake  549 
echo  chamber  563 
ecology  15 
economics  vii,  4 

eigenequation  66,  67,  371,  439,  583 
eigenfunction  67,  74,  110,  125,  190,  340, 
371,  374,  375,  376,  378,  387,  395, 
439,  445,  515,  517,  535,  540,  546, 
547,  549,  565,  566 
atomic  568,  570 
expansion  340,  379 
Fourier-Bessel  477 
null  70,  132,  145,  386,  387,  389,  441, 
487,  547,  550 

series  109,  371,  378,  386,  435,  441, 
443,  535,  544 
Sturm-Liouville  382 
eigenmode  389,  497,  499,  546 
eigensolution  66,  67,  125,  140,  389,  395, 
447,  475,  487,  566,  584 
series  440 

eigenspace  371,  382,  517,  568,  583 
eigenstate  568 

eigenvalue  ix,  2,  66,  67,  125,  190,  336, 
340,  371,  376,  378,  387,  389,  395, 
487,  517,  535,  541,  546,  549,  565, 
568,  582-4 
complex  372 

equation  66,  67,  371,  439,  582 

Dirichlet  373,  377 

Helmholtz  377,  383,  446,  474,  535, 

546 

multiplicity  582 
null  131,  439,  582 
problem  130,  371,  446 
simple  372,  391,  582 
zero  131,  439,  582 

eigenvector  ix,  66,  371,  372,  375,  582, 
583,  584 

basis  66,  583,  584 
null  582 

eikonal  equation  vii 
Einstein  equation  vii 
elastic  550 
ball  504 

bar  15,  49,  234,  241,  351,  357 
beam  146,  393 
media  427 
plate  324,  486 


vibration  486 
wave  121 

elasticity  vii,  2,  121,  175,  504 
elastodynamics  486,  549 
elastomechanics  341 
electric 

charge  256,  529,  531,  564 
field  341,  504,  546,  551 
potential  504 
electromagnetic  254,  395 
potential  2,  564 
vibration  2 

wave  15,  121,  388,  503,  546,  551 
electromagnetism  vii,  121,  154,  341,  504 
electromotive  force  504 
567,  569

electronic  music  63 
electrostatic  242,  249,  531 
force  504,  529

potential  152,  249,  252,  256,  503,  522, 
affine  414,  416
boundary  424 
410,  411,  416,  427,  430,  431
elevation  31
ellipse  173 
ellipsoid  511 

ellipsoidal  coordinates  508 
boundary  value  problem  207,  216,
orbit  565

energy  39,  286,  292,  324,  565,  568,  569 
conservation  61,  151,  535 
density  61,  122 
heat  122,  246,  304,  435-7,  482 
kinetic  61 
level  395,  503,  567 
operator  394 

potential  6,  61,  152,  242,  318,  340, 
thermal  121,  122,  132,  134,  139,  169,
295,  304 
total  61,  151 
549

enhancement  129,  442 
entropic  decay  291 
entropy  317 
condition  43,  46 
envelope  230 

Equal  Area  Rule  40,  46,  47,  431 
equation 
Airy  281,  459 
algebraic  1,  8,  428 
algebraic  differential  455 
backwards  heat  129,  299,  442 
X-ray  546
xylophone  144 


yard  307 
zenith 

angle  508,  515 
differential  equation  510 
zero 

complex  87 